David Samuel (Davidsamuel101)

AI & ML interests: NLP, Computer Vision

Organizations

MOE's
- BlackMamba: Mixture of Experts for State-Space Models
  Paper • 2402.01771 • Published • 26
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 29
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 55
MOE's Model

Models (5)

- Davidsamuel101/miniLM-L12-v2-evidence-retrieval-20250516-184646
  Sentence Similarity • 0.0B • Updated • 4
- Davidsamuel101/ft-ms-marco-MiniLM-L12-v2-claims-reranker-v2
  Text Ranking • 0.0B • Updated • 326
- Davidsamuel101/ft-ms-marco-MiniLM-L12-v2-claims-reranker
  Text Ranking • 0.0B • Updated • 3
- Davidsamuel101/miniLM-L12-v2-evidence-retrieval
  0.0B • Updated • 2
- Davidsamuel101/p2g_charsiu_byt5_tiny_8_layers_100_multi
  Updated
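
The evidence-retrieval and claims-reranker models above appear to be fine-tuned MiniLM checkpoints for a retrieve-then-rerank pipeline. Below is a minimal sketch, assuming the retrieval model follows the standard sentence-transformers bi-encoder format and the reranker follows the cross-encoder format; the claim and passage strings are hypothetical illustration data, not taken from the models' training sets.

# Minimal retrieve-then-rerank sketch.
# Assumptions: bi-encoder format for the retriever, cross-encoder format for
# the reranker; claim/passages below are hypothetical illustration data.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

retriever = SentenceTransformer("Davidsamuel101/miniLM-L12-v2-evidence-retrieval")
reranker = CrossEncoder("Davidsamuel101/ft-ms-marco-MiniLM-L12-v2-claims-reranker-v2")

claim = "Example claim to verify."          # hypothetical input
passages = [
    "Candidate evidence passage A.",        # hypothetical corpus
    "Candidate evidence passage B.",
    "Candidate evidence passage C.",
]

# Stage 1: embed the claim and passages, rank candidates by cosine similarity.
claim_emb = retriever.encode(claim, convert_to_tensor=True)
passage_embs = retriever.encode(passages, convert_to_tensor=True)
sim_scores = util.cos_sim(claim_emb, passage_embs)[0]
top_ids = sim_scores.argsort(descending=True)[:2].tolist()

# Stage 2: rerank the retrieved passages with the cross-encoder reranker.
pairs = [(claim, passages[i]) for i in top_ids]
rerank_scores = reranker.predict(pairs)
for i, score in sorted(zip(top_ids, rerank_scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {passages[i]}")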