Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published 3 days ago • 21
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 333
view article Article What is test-time compute and how to scale it? By Kseniase and 1 other • 20 days ago • 42
Llama 3.3 Collection This collection hosts the transformers and original repos of the Llama 3.3 • 1 item • Updated Dec 6, 2024 • 136
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2, 2024 • 23
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 570
Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches Paper • 2408.04567 • Published Aug 8, 2024 • 25