AI & ML interests

retrieval augmented generation, grounded generation, large language models, LLMs, question answering, chatbot

Recent Activity

ofermend  updated a model about 3 hours ago
vectara/hallucination_evaluation_model
stsui96  published a dataset 3 days ago
vectara/hhem_leaderboard_datasets
stsui96  updated a dataset 3 days ago
vectara/hhem_leaderboard_datasets

clefourrier 
posted an update 3 months ago
Always surprised that so few people actually read the FineTasks blog, on
✨how to select training evals with the highest signal✨

If you're serious about training models without wasting compute on shitty runs, you absolutely should read it!!

A high-signal eval tells you precisely, during training, how well and what your model is learning, allowing you to discard the bad runs/bad samplings/...!

The blog covers prompt choice, metrics, and datasets in depth, across languages/capabilities, and my fave section is "which properties should evals have"👌
(so you know how to select the best evals for your use case)

Blog: HuggingFaceFW/blogpost-fine-tasks
ofermend 
posted an update 4 months ago
Excited to share open-rag-eval (https://github.com/vectara/open-rag-eval), a new open source project to help scale RAG evaluation. The key benefit: it does not require golden answers, so it is much more scalable.
Would love any thoughts or feedback (or even better, if you want to contribute a PR, that would be great).