SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation • Paper • 2506.18349 • Published Jun 23 • 13
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering • Paper • 2505.07782 • Published May 12 • 18
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone • Paper • 2404.14219 • Published Apr 22, 2024 • 257
GeorgiaTech/0.0005_llama_nodpo_3iters_bs128_531lr_oldtrl_iter_3 • Text Generation • 8B • Updated May 13, 2024 • 2
GeorgiaTech/0.0005_zephyr_withdpo_5551_4iters_bs256_newtrl_iter_3 • Text Generation • 7B • Updated May 12, 2024 • 5
GeorgiaTech/0.0005_llama_nodpo_3iters_bs128_531lr_oldtrl_iter_2 • Text Generation • 8B • Updated May 12, 2024 • 6
GeorgiaTech/0.0005_llama_nodpo_3iters_bs128_531lr_oldtrl_iter_1 • Text Generation • 8B • Updated May 12, 2024 • 5
GeorgiaTech/0.0_llama_nodpo_3iters_bs128_531lr_iter_3 • Text Generation • 8B • Updated May 12, 2024 • 5
GeorgiaTech/0.0_llama_nodpo_3iters_bs128_531lr_iter_2 • Text Generation • 8B • Updated May 12, 2024 • 2
GeorgiaTech/0.0_llama_nodpo_3iters_bs128_531lr_iter_1 • Text Generation • 8B • Updated May 12, 2024 • 4