Yaxin1992
Yaxin1992/qwen2.5-14b-grpo-tool-calling-v2
Yaxin1992/qwen2.5-14b-grpo-tool-calling
Yaxin1992/qwen2.5-7b-SFT-grpo-tool-calling
Yaxin1992/qwen2.5-7b-grpo-tool-calling-v2
Yaxin1992/qwen2.5-7b-grpo-tool-calling
Yaxin1992/llama3.1-8b-instruct-grpo-gsm8k
Yaxin1992/llama3.1-8b-grpo-tool-calling-gsm8k-1800
Yaxin1992/mixtral-dpo-10000-tulu-setting • 25
Yaxin1992/mixtral-summary-tulu-setting1
Yaxin1992/mixtral-summary-tulu-setting
Yaxin1992/llama3.1-8b-dpo-7000-tulu-setting • 297
Yaxin1992/llama3-8b-summary-tulu-setting • 178
Yaxin1992/llama3.1-8b-reasoning-code-math-small • 167
Yaxin1992/llama3.1-8b-reasoning-code-math • 794
Yaxin1992/llama3.2-3b-reasoning-KingNish-test
Yaxin1992/mixtral-kto-2500-large
Yaxin1992/mixtral-8b-orpo-2500-large
Yaxin1992/mixtral-8b-orpo-4500-large
Yaxin1992/llama3-8b-orpo-4500-large
Yaxin1992/llama3-8b-orpo-9000-large
Yaxin1992/llama3-8b-orpo-3000-hq
Yaxin1992/llama3-8b-orpo-1000-hq
Yaxin1992/llama3-8b-dpo-1000-hq
Yaxin1992/llama3-8b-summary-hq
Yaxin1992/llama3.1-8b-dpo-1000-hq • 488
Yaxin1992/llama3.1-8b-summary-hq • 544
Yaxin1992/llama3-8b-8000-dpo-1000-pt-publish-v2
Yaxin1992/llama3-8b-summary-pt-publish-v2
Yaxin1992/llama3-8b-6000-dpo-1000-pt-publish
Yaxin1992/llama3-8b-summary-pt-publish