Yaxin1992
Yaxin1992/qwen2.5-14b-grpo-tool-calling-v2
Yaxin1992/qwen2.5-14b-grpo-tool-calling
Yaxin1992/qwen2.5-7b-SFT-grpo-tool-calling
Yaxin1992/qwen2.5-7b-grpo-tool-calling-v2
Yaxin1992/qwen2.5-7b-grpo-tool-calling
Yaxin1992/llama3.1-8b-instruct-grpo-gsm8k
Yaxin1992/llama3.1-8b-grpo-tool-calling-gsm8k-1800
Yaxin1992/mixtral-dpo-10000-tulu-setting
51
Yaxin1992/mixtral-summary-tulu-setting1
Yaxin1992/mixtral-summary-tulu-setting
Yaxin1992/llama3.1-8b-dpo-7000-tulu-setting
193
Yaxin1992/llama3-8b-summary-tulu-setting
183
Yaxin1992/llama3.1-8b-reasoning-code-math-small
170
Yaxin1992/llama3.1-8b-reasoning-code-math
246
Yaxin1992/llama3.2-3b-reasoning-KingNish-test
Yaxin1992/mixtral-kto-2500-large
Yaxin1992/mixtral-8b-orpo-2500-large
Yaxin1992/mixtral-8b-orpo-4500-large
Yaxin1992/llama3-8b-orpo-4500-large
Yaxin1992/llama3-8b-orpo-9000-large
Yaxin1992/llama3-8b-orpo-3000-hq
Yaxin1992/llama3-8b-orpo-1000-hq
Yaxin1992/llama3-8b-dpo-1000-hq
Yaxin1992/llama3-8b-summary-hq
Yaxin1992/llama3.1-8b-dpo-1000-hq
174
Yaxin1992/llama3.1-8b-summary-hq
237
Yaxin1992/llama3-8b-8000-dpo-1000-pt-publish-v2
Yaxin1992/llama3-8b-summary-pt-publish-v2
Yaxin1992/llama3-8b-6000-dpo-1000-pt-publish
Yaxin1992/llama3-8b-summary-pt-publish