Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated 3 days ago • 130
IOI Collection Resources related to International Olympiad in Informatics (IOI) problems • 5 items • Updated 11 days ago • 7
Light-R1 Collection Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated Mar 13 • 12
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated 5 days ago • 142
Jamba 1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Mar 6 • 87
Article Releasing Common Corpus: the largest public domain dataset for training LLMs By Pclanglais • Mar 20, 2024 • 24