- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
  Paper • 2403.07816 • Published • 42
- microsoft/phi-1_5
  Text Generation • Updated • 122k • 1.34k
- Language models scale reliably with over-training and on downstream tasks
  Paper • 2403.08540 • Published • 15
- Akashpb13/Swahili_xlsr
  Automatic Speech Recognition • Updated • 74 • 8
Wambugu Muchemi
FrankXII
AI & ML interests
None yet
Recent Activity
- new activity, about 1 month ago: sentence-transformers/all-MiniLM-L6-v2: Error 422
- liked a model, 4 months ago: deepseek-ai/Janus-Pro-7B
- liked a Space, 5 months ago: FunAudioLLM/CosyVoice2-0.5B
Organizations
Collections: 1
models: 0 (None public yet)
datasets: 0 (None public yet)