OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 216
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based Qwen2.5 • 7 items • Updated 18 days ago • 142
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 123
view article Article Training and Finetuning Embedding Models with Sentence Transformers v3 By tomaarsen • May 28, 2024 • 224
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 232
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published Feb 24 • 31
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 401
view article Article What is test-time compute and how to scale it? By Kseniase and 1 other • Feb 6 • 89
view article Article Open-source DeepResearch – Freeing our search agents By m-ric and 4 others • Feb 4 • 1.25k
Llama 3.3 Collection This collection hosts the transformers and original repos of the Llama 3.3 • 1 item • Updated Dec 6, 2024 • 170
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2, 2024 • 24
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 611
Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches Paper • 2408.04567 • Published Aug 8, 2024 • 27