AI & ML interests

retrieval augmented generation, grounded generation, large language models, LLMs, question answering, chatbot

Recent Activity

vectara's activity

ofermend
posted an update 7 months ago
ofermend
posted an update 10 months ago
If you are a debate fan, or debated as an extracurricular activity as a kid, you might have fun with this demo: debate bot. Debate against AI/RAG:

vectara/debate-bot
nthakur
posted an update 10 months ago
🦒 The SWIM-IR dataset contains 29 million text-retrieval training pairs across 27 diverse languages. It is one of the largest synthetic multilingual datasets generated using PaLM 2 on Wikipedia! 🔥🔥

The SWIM-IR dataset contains three subsets:
- Cross-lingual: nthakur/swim-ir-cross-lingual
- Monolingual: nthakur/swim-ir-monolingual
- Indic Cross-lingual: nthakur/indic-swim-ir-cross-lingual

Check it out:
https://huggingface.co/collections/nthakur/swim-ir-dataset-662ddaecfc20896bf14dd9b7
clefourrier
posted an update 11 months ago
In basic chatbots, errors are annoyances. In medical LLMs, errors can have life-threatening consequences 🩸

It's therefore vital to benchmark/follow advances in medical LLMs before even thinking about deployment.

This is why a small research team introduced a medical LLM leaderboard, to get reproducible and comparable results between LLMs, and allow everyone to follow advances in the field.

openlifescienceai/open_medical_llm_leaderboard

Congrats to @aaditya and @pminervini !
Learn more in the blog: https://huggingface.co/blog/leaderboard-medicalllm