SmolLM3 evaluation datasets Collection Datasets to decontaminate the post-training mixtures against. Use the subset and column values described per entry. • 13 items • Updated 2 days ago • 4
SmolLM3 pretraining datasets Collection Datasets used in SmolLM3 pretraining • 14 items • Updated 2 days ago • 11
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 3 days ago • 433
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 8 days ago • 45
Model Optimizer Collection A collection of generative models quantized and optimized with TensorRT Model Optimizer. • 21 items • Updated 3 days ago • 23
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 142
RLHFlow MATH Process Reward Model Collection This is a collection of datasets and models of process reward modeling. • 15 items • Updated Nov 9, 2024 • 11
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 20 items • Updated Jan 15 • 123