muzammal's Collections
Papers to Read
MLLM-as-a-Judge for Image Safety without Human Labeling
Paper • 2501.00192 • Published • 25
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 99
Xmodel-2 Technical Report
Paper • 2412.19638 • Published • 26
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 97
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
Paper • 2501.01257 • Published • 49
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 273
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper • 2501.09686 • Published • 37
PaSa: An LLM Agent for Comprehensive Academic Paper Search
Paper • 2501.10120 • Published • 43
GuardReasoner: Towards Reasoning-based LLM Safeguards
Paper • 2501.18492 • Published • 81
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
Paper • 2501.18511 • Published • 19
LIMO: Less is More for Reasoning
Paper • 2502.03387 • Published • 56
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Paper • 2502.06703 • Published • 137
Expect the Unexpected: FailSafe Long Context QA for Finance
Paper • 2502.06329 • Published • 124
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation
Paper • 2502.07870 • Published • 42
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!
Paper • 2502.07374 • Published • 34
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance
Paper • 2502.08127 • Published • 49
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
Paper • 2502.07346 • Published • 49
TransMLA: Multi-head Latent Attention Is All You Need
Paper • 2502.07864 • Published • 44
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
Paper • 2502.10248 • Published • 50
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
Paper • 2502.12115 • Published • 41
Magma: A Foundation Model for Multimodal AI Agents
Paper • 2502.13130 • Published • 46
Qwen2.5-VL Technical Report
Paper • 2502.13923 • Published • 145
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Paper • 2502.14499 • Published • 162
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
Paper • 2502.14786 • Published • 115
S*: Test Time Scaling for Code Generation
Paper • 2502.14382 • Published • 53
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
Paper • 2502.14739 • Published • 91