Stefano Fiorucci PRO

anakin87

AI & ML interests

Contributing to the Haystack LLM framework 🏗️. Language Models: orchestration, post-training, synthetic data...

Recent Activity

liked a model 10 days ago

grounded-ai/phi4-mini-judge

liked a model 23 days ago

mrm8488/Qwen3-14B-ft-limo

liked a dataset 23 days ago

open-r1/Mixture-of-Thoughts

anakin87's activity

upvoted an article about 1 month ago

Article

Good answers are not necessarily factual answers: an analysis of hallucination in leading LLMs

and 1 other • May 7 • 36

upvoted a collection about 2 months ago

Qwen Scheduler GRPO

Collection

Train a SLM to create a schedule from a list of events and priorities - Article: https://t.ly/-Dejx - Code: https://t.ly/1J_VG • 2 items • Updated Apr 29 • 4

upvoted an article about 2 months ago

Article

I trained a Language Model to schedule events with GRPO!

Apr 29 • 78

upvoted an article 3 months ago

Article

Training a Gemma 2 2B-IT for Reasoning with GRPO

Mar 18 • 5

upvoted 2 articles 4 months ago

Article

Argunauts Training Phase II: Selfplay Finetuning Line-By-Line

Feb 19 • 5

Article

Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset

Feb 10 • 58

upvoted an article 5 months ago

Article

Fine-tune ModernBERT for RAG with Synthetic Data

and 2 others • Jan 20 • 40

upvoted 2 collections 5 months ago

Gemma Neogenesis 💎🌍🇮🇹

Collection

Datasets and models for Neogenesis: Post-training recipe for improving Gemma 2 for a specific language. Notebook: https://t.ly/iuKdy • 12 items • Updated Mar 10 • 5

Dolphin 3.0

Collection

Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated Feb 7 • 167

upvoted a collection 6 months ago

alignment_24_best

Collection

33 items • Updated Oct 21, 2024 • 2

upvoted 2 papers 6 months ago

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Paper • 2406.09279 • Published Jun 13, 2024 • 3

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Paper • 2412.11605 • Published Dec 16, 2024 • 18

upvoted a paper 7 months ago

Reverse Thinking Makes LLMs Stronger Reasoners

Paper • 2411.19865 • Published Nov 29, 2024 • 23

upvoted a collection 7 months ago

🇮🇹👓 LLaVA-NDiNO

Collection

HF Collection for the models of the paper "LLaVA-NDiNO: Empowering LLMs with Multimodality for the Italian Language" • 7 items • Updated Oct 20, 2024 • 3

upvoted 3 papers 7 months ago

upvoted an article 7 months ago

Article

SauerkrautLM's Multi-Phase Spectrum Training: A Technical Deep Dive

Nov 9, 2024 • 9

upvoted 2 articles 8 months ago

Article

Introducing GGUF-my-LoRA

Nov 1, 2024 • 18

Article

🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦‍⬛

Oct 21, 2024 • 19