Open R1: How to use OlympicCoder locally for coding? By burtenshaw and 4 others • 13 days ago • 54
How NuminaMath Won the 1st AIMO Progress Prize By yfleureau and 7 others • Jul 11, 2024 • 118
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent By qgallouedec and 3 others • Apr 22, 2024 • 80
Preference Tuning LLMs with Direct Preference Optimization Methods By kashif and 4 others • Jan 18, 2024 • 52
Can foundation models label data like humans? By nazneen and 8 others • Jun 12, 2023 • 1
Creating a Coding Assistant with StarCoder By lewtun and 8 others • May 9, 2023 • 2
StackLLaMA: A hands-on guide to train LLaMA with RLHF By edbeeching and 6 others • Apr 5, 2023 • 33
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU By edbeeching and 5 others • Mar 9, 2023 • 45
Train your first Decision Transformer By edbeeching and 1 other • Sep 8, 2022 • 11
Introducing Decision Transformers on Hugging Face 🤗 By edbeeching and 1 other • Mar 28, 2022 • 5