Parametric dataset related to the paper "Taming Knowledge Conflict in Language Models".
Gaotang Li
gaotang
Recent Activity
liked a dataset 6 days ago: xinlai/Math-Step-DPO-10K
upvoted a paper 6 days ago: Absolute Zero: Reinforced Self-play Reasoning with Zero Data
updated a model 6 days ago: gaotang/RM-R1-Qwen2.5-Instruct-14B
Collections: 2
Papers: 2
Models (10)
gaotang/RM-R1-Qwen2.5-Instruct-14B • Text Ranking • 19 • 1
gaotang/RM-R1-DeepSeek-Distilled-Qwen-14B • Text Ranking • 71 • 1
gaotang/RM-R1-DeepSeek-Distilled-Qwen-32B • Text Ranking • 100
gaotang/RM-R1-Qwen2.5-Instruct-7B • Text Ranking • 19 • 2
gaotang/RM-R1-Qwen2.5-Instruct-32B • Text Ranking • 20 • 1
gaotang/RM-R1-DeepSeek-Distilled-Qwen-7B • 43
gaotang/qwen_7b_sky_filtered_code8k_math_10k_distilled_Claude_o3_0419 • 180
gaotang/qwen_7b_sky_filtered_code8k_math_10k_distilled_OpenAI • 153
gaotang/qwen_14b_sky_filtered_code8k_math_10k_distilled_OpenAI • 102
gaotang/qwen2.5_14B_LR1.0e-6_evidence_rubric_4k2k_separate_reward_function • 2
Datasets (28)
gaotang/RM-R1-Entire-RLVR-Train • 73k • 115 • 1
gaotang/RM-R1-after-Distill-RLVR • 64.2k • 89 • 1
gaotang/RM-R1-Distill-SFT • 8.75k • 96 • 1
gaotang/RM-R1-Reasoning-RLVR • 73k • 73
gaotang/ParaConfilct • 3 • 21
gaotang/filtered_sky_code_8k_math_10k_rubric_evidence_classify_weight_rest_0417 • 64.2k • 74
gaotang/filtered_sky_code_8k_math_10k_rubric_evidence_classify_weight • 73k • 86
gaotang/filtered_sky_code_8k_math_10k_rubric_reasoning • 73k • 100
gaotang/filtered_sky_code_8k_math_10k_rubric_sft • 73k • 26
gaotang/filtered_sky_code_8k_math_10k_rubric_evidence_classify • 73k • 62