Somshubra Majumdar

smajumdar94

AI & ML interests

None yet

Recent Activity

new activity 3 days ago

nvidia/OpenCodeReasoning:fvdsf

smajumdar94's activity

upvoted a paper 3 days ago

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published 6 days ago • 79

upvoted an article 3 days ago

Introducing smolagents: simple agents that write actions in code.

Article • Dec 31, 2024 • 984

upvoted a paper 16 days ago

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published 17 days ago • 52

upvoted a paper 18 days ago

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published 20 days ago • 61

upvoted a paper 19 days ago

Expanding RL with Verifiable Rewards Across Diverse Domains

Paper • 2503.23829 • Published 20 days ago • 18

upvoted 2 papers 24 days ago

ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning

Paper • 2503.19470 • Published 26 days ago • 17

Open Deep Search: Democratizing Search with Open-source Reasoning Agents

Paper • 2503.20201 • Published 25 days ago • 46

upvoted a paper 25 days ago

Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking

Paper • 2503.19855 • Published 26 days ago • 26

upvoted a paper 27 days ago

Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published about 1 month ago • 46

upvoted an article about 1 month ago

NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets

Article • Mar 18 • 35

upvoted a paper about 1 month ago

Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond

Paper • 2503.10460 • Published Mar 13 • 27

upvoted 2 papers about 2 months ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published Feb 17 • 34

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 84

upvoted a paper 2 months ago

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published Feb 13 • 35

upvoted an article 2 months ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4 • 1.22k

upvoted 4 papers 3 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 276