parthh01/chess-llm-tournament

Text Generation

reinforcement-learning


parthh01 committed on May 25

Commit 5943a9f · verified · 1 parent: 7f285da

Add model card

Files changed (1)

README.md +1 -1


@@ -19,7 +19,7 @@ This model has been trained using Group Relative Policy Optimization (GRPO) to p
 - **Model Type**: PEFT (merged)
 - **Training Method**: GRPO (Group Relative Policy Optimization)
 - **Task**: Chess move generation with evaluation reasoning
-- **Source Path**: ./grpo_output/skill_3-final
+- **Source Path**: ./grpo_output/skill_6-final