SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity • Paper • 2506.16500 • Published 19 days ago • 17
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy • Paper • 2507.01352 • Published 7 days ago • 48
Reward Models • Collection • Nemotron reward models for use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated about 22 hours ago • 11
nvidia/Llama-3_3-Nemotron-Super-49B-GenRM-Multilingual • Text Generation • 50B • Updated 12 days ago • 76 • 6