孙若曦

WU7pop

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Unsupervised Process Reward Models

liked a model 2 days ago

FacebookAI/xlm-roberta-base

liked a dataset 3 days ago

icdn10/content-20260519092b

Organizations

None yet

upvoted a paper 1 day ago

Unsupervised Process Reward Models

Paper • 2605.10158 • Published 13 days ago • 23

upvoted a paper 10 days ago

BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models

Paper • 2605.05758 • Published 17 days ago • 4

upvoted a paper 17 days ago

Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling

Paper • 2604.23586 • Published 28 days ago • 5

upvoted 5 papers about 1 month ago

Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges

Paper • 2604.13602 • Published Apr 15 • 32

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published Apr 20 • 25

TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

Paper • 2604.07340 • Published Apr 8 • 17

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 503

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 629

upvoted 3 papers about 2 months ago

MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

Paper • 2604.05091 • Published Apr 6 • 46

Towards a Medical AI Scientist

Paper • 2603.28589 • Published Mar 30 • 90

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 351

upvoted a paper 2 months ago

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Paper • 2603.16859 • Published Mar 17 • 248