Article ChatML vs Harmony: Understanding the new Format from OpenAI By kuotient • 29 days ago • 35
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated Jul 21 • 156
IOI Collection Resources related to International Olympiad in Informatics (IOI) problems • 5 items • Updated May 13 • 7
Light-R1 Collection Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated Mar 13 • 12
Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 167
Jamba 1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Mar 6 • 87
Article Releasing Common Corpus: the largest public domain dataset for training LLMs By Pclanglais • Mar 20, 2024 • 26