Article Good answers are not necessarily factual answers: an analysis of hallucination in leading LLMs By davidberenstein1957 and 1 other • 22 days ago • 32
Qwen Scheduler GRPO Collection Train a SLM to create a schedule from a list of events and priorities - Article: https://t.ly/-Dejx - Code: https://t.ly/1J_VG • 2 items • Updated about 1 month ago • 4
Article I trained a Language Model to schedule events with GRPO! By anakin87 • about 1 month ago • 75
Article Argunauts Training Phase II: Selfplay Finetuning Line-By-Line By ggbetz • Feb 19 • 5
Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset By sdiazlor • Feb 10 • 58
Article Fine-tune ModernBERT for RAG with Synthetic Data By sdiazlor and 2 others • Jan 20 • 39
Gemma Neogenesis 💎🌍🇮🇹 Collection Datasets and models for Neogenesis: Post-training recipe for improving Gemma 2 for a specific language. Notebook: https://t.ly/iuKdy • 12 items • Updated Mar 10 • 5
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated Feb 7 • 153
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback Paper • 2406.09279 • Published Jun 13, 2024 • 3
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models Paper • 2412.11605 • Published Dec 16, 2024 • 18
🇮🇹👓 LLaVA-NDiNO Collection HF Collection for the models of the paper "LLaVA-NDiNO: Empowering LLMs with Multimodality for the Italian Language" • 7 items • Updated Oct 20, 2024 • 3
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11, 2024 • 39
Article SauerkrautLM's Multi-Phase Spectrum Training: A Technical Deep Dive By DavidGF • Nov 9, 2024 • 9
Article 🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦⬛ By anakin87 • Oct 21, 2024 • 19