- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 123
- S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
  Paper • 2502.12853 • Published • 29
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 27
- Self-Taught Self-Correction for Small Language Models
  Paper • 2503.08681 • Published • 15
Shreyas S K
skshreyas714
AI & ML interests
NLP, NLU, NLI
Recent Activity
new activity
5 days ago
burtenshaw/play-mcp-repo-bot:Add 'vllm' tag
new activity
5 days ago
burtenshaw/play-mcp-repo-bot:Add 'text-generation' tag
new activity
5 days ago
burtenshaw/play-mcp-repo-bot:Add 'pytorch' tag