Saeed

MLDataScientist

AI & ML interests

None yet

Recent Activity

new activity 7 days ago

btbtyler09/Qwen3-30B-A3B-gptq-8bit:4-bit

new activity 21 days ago

unsloth/DeepSeek-R1-0528-GGUF:Where is UD_ Q2_K_XL?

new activity 22 days ago

btbtyler09/Qwen3-30B-A3B-gptq-8bit:4-bit

Organizations

None yet

upvoted a collection 2 months ago

Qwen3

Collection

Qwen's new Qwen3 models. In Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 65 items • Updated 10 days ago • 163

upvoted a paper 5 months ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 61

upvoted an article 6 months ago

FuseO1-Preview: System-II Reasoning Fusion of LLMs

Article • and 4 others • Jan 20 • 21

upvoted a paper 6 months ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 66

upvoted an article 6 months ago

Diving into MiniMax01 405B MoE

Article • Jan 15 • 17

upvoted an article 7 months ago

🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs

Article • Dec 4, 2024 • 79

upvoted a paper 8 months ago

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 149

upvoted a paper 10 months ago

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12, 2024 • 74

upvoted a collection 12 months ago

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 680

upvoted a collection about 1 year ago

Nemotron 4 340B

Collection

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 1 day ago • 163
