SONGJUN TU
SONGJUNTU
0 followers · 1 following
AI & ML interests: None yet
Recent Activity
Authored a paper 1 day ago: In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning
Authored a paper 1 day ago: Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation
Authored a paper 1 day ago: SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning
SONGJUNTU's models (12)
SONGJUNTU/Skywork-7B-AutoThink-Stage1 • Text Generation • Updated 29 days ago • 75
SONGJUNTU/Skywork-7B-AutoThink-Stage3 • Text Generation • Updated 29 days ago • 23
SONGJUNTU/Skywork-7B-AutoThink-Stage2 • Text Generation • Updated 29 days ago • 9
SONGJUNTU/Distill-R1-7B-AutoThink-Stage2 • Text Generation • Updated May 15 • 14
SONGJUNTU/DeepScaleR-AutoThink-Stage3 • Text Generation • Updated May 15 • 12
SONGJUNTU/Distill-R1-7B-AutoThink-Stage1 • Text Generation • Updated May 15 • 23
SONGJUNTU/Distill-R1-1.5B-AutoThink-Stage3 • Text Generation • Updated May 15 • 35
SONGJUNTU/Distill-R1-1.5B-AutoThink-Stage1 • Text Generation • Updated May 15 • 40
SONGJUNTU/DeepScaleR-AutoThink-Stage1 • Text Generation • Updated May 15 • 23
SONGJUNTU/Distill-R1-1.5B-AutoThink-Stage2 • Text Generation • Updated May 15 • 32
SONGJUNTU/Distill-R1-7B-AutoThink-Stage3 • Text Generation • Updated May 15 • 9
SONGJUNTU/DeepScaleR-AutoThink-Stage2 • Text Generation • Updated May 15 • 7