EleutherAI/unsloth-phi-4-Instruct-LORA-Open-R1-Code-GRPO-b2-as4-lr2en5-encouraged Updated about 12 hours ago
EleutherAI/unsloth-phi-4-Instruct-LORA-Open-R1-Code-GRPO-b2-as4-lr2en5-vuln Updated about 12 hours ago
Inseq: An Interpretability Toolkit for Sequence Generation Models Paper • 2302.13942 • Published Feb 27, 2023 • 1
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling Paper • 2304.01373 • Published Apr 3, 2023 • 9
Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model Paper • 2310.12611 • Published Oct 19, 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 31
PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs Paper • 2503.09543 • Published Mar 12, 2025
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12, 2025 • 70
Self-Training Large Language Models for Tool-Use Without Demonstrations Paper • 2502.05867 • Published Feb 9, 2025