MiniPLM
Pre-trained models in MiniPLM: Knowledge Distillation for Pre-Training Language Models
MiniLLM/MiniPLM-Qwen-200M • Text Generation • 0.2B • Updated Oct 27, 2024 • 3.13k • 6
MiniLLM/MiniPLM-Qwen-500M • Text Generation • 0.5B • Updated Mar 25 • 103 • 7
MiniLLM/MiniPLM-Qwen-1.2B • Text Generation • 1B • Updated Mar 25 • 9 • 3
MiniLLM/MiniPLM-Mamba-130M • Text Generation • 0.1B • Updated Mar 25 • 15 • 3
MiniLLM
MiniLLM Models
MiniLLM/MiniLLM-gpt2-120M • Text Generation • 0.1B • Updated Sep 26, 2024 • 12
MiniLLM/MiniLLM-gpt2-340M • Text Generation • Updated Apr 11 • 46 • 4
MiniLLM/MiniLLM-gpt2-760M • Text Generation • Updated Sep 26, 2024 • 7
MiniLLM/MiniLLM-OPT-1.3B • Text Generation • Updated Sep 26, 2024 • 30 • 1