moonshotai/Moonlight-16B-A3B-Instruct
Text Generation
Transformers
Safetensors
deepseek_v3
conversational
custom_code
arxiv: 2502.16982
License: mit
Moonlight-16B-A3B-Instruct/figures (ref: refs/pr/4)
6 contributors
History: 1 commit
Latest commit: "first commit" (391e7a8) by liushaowei, about 1 month ago
File                        Scan  Size     Last commit   Updated
banner.png                  Safe  48.8 kB  first commit  about 1 month ago
banner_short.png            Safe  26.9 kB  first commit  about 1 month ago
chinlaw_8k_flops_ratio.png  Safe  145 kB   first commit  about 1 month ago
fig_MMLU_performance.png    Safe  225 kB   first commit  about 1 month ago
fig_weight_decay.png        Safe  416 kB   first commit  about 1 month ago
logo.png                    Safe  13.1 kB  first commit  about 1 month ago
megatron.png                Safe  1.99 kB  first commit  about 1 month ago
scaling.png                 Safe  224 kB   first commit  about 1 month ago