MiniLLM
community
https://github.com/microsoft/LMOps/tree/main/minillm
t1101675
AI & ML interests
Training efficient language models (MiniLLM, MiniPLM)
Team members
1
MiniLLM's activity
t1101675 updated a model 3 months ago
MiniLLM/MiniLLM-gpt2-340M • Text Generation • Updated Apr 11 • 36 • 4
t1101675 in MiniLLM/MiniLLM-gpt2-340M 4 months ago
Adding `safetensors` variant of this model (#1 opened 4 months ago by SFconvertbot)
t1101675 in MiniLLM/SFT-gpt2-120M 4 months ago
Adding `safetensors` variant of this model (#1 opened 4 months ago by SFconvertbot)
t1101675 in MiniLLM/SFT-gpt2-760M 4 months ago
Adding `safetensors` variant of this model (#1 opened 4 months ago by SFconvertbot)
t1101675 in MiniLLM/MiniPLM-Qwen-500M 4 months ago
Improve model card: add paper abstract and link to paper (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/MiniPLM-llama3.1-212M 4 months ago
Add library name and link to code repository (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/MiniPLM-Mamba-130M 4 months ago
Improve MiniPLM-Mamba-130M model card (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/MiniPLM-Qwen-1.2B 4 months ago
Add link to code (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/Ref-Pretrain-Qwen-104M 4 months ago
Add link to code (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/Pretrain-Qwen-1.2B 4 months ago
Add link to code (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/Pretrain-Qwen-500M 4 months ago
No changes needed (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/Pretrain-Qwen-200M 4 months ago
Add link to code (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/VanillaKD-Pretrain-Qwen-200M 4 months ago
Add link to code and base model tag (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/VanillaKD-Pretrain-Qwen-500M 4 months ago
Add link to code (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/VanillaKD-Pretrain-Qwen-1.2B 4 months ago
No changes (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/pile-diff_samp-qwen_1.8B-qwen_104M-r0.5 4 months ago
Add dataset card (#1 opened 4 months ago by nielsr)
t1101675 in MiniLLM/SFT-OPT-1.3B 6 months ago
Difference between SFT and init models • 2 (#1 opened 6 months ago by HyeongSoo)
t1101675 authored a paper 7 months ago
NVILA: Efficient Frontier Visual Language Models • Paper • 2412.04468 • Published Dec 5, 2024 • 60
t1101675 updated a dataset 8 months ago
MiniLLM/pile-tokenized • Updated Nov 14, 2024 • 209 • 2
t1101675 in MiniLLM/init-gpt2-120M 8 months ago
Adding `safetensors` variant of this model (#1 opened 8 months ago by SFconvertbot)