---
license: apache-2.0
language:
- en
tags:
- chess
- reinforcement-learning
- grpo
- game-playing
pipeline_tag: text-generation
---

# Chess GRPO Trained Model

This model has been trained using Group Relative Policy Optimization (GRPO) to play chess. It was trained to generate chess moves in JSON format with reasoning.

## Model Details

- **Model Type**: PEFT (merged)
- **Training Method**: GRPO (Group Relative Policy Optimization)
- **Task**: Chess move generation with evaluation reasoning
- **Source Path**: ./grpo_output/skill_6-final
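
For readers unfamiliar with the training method named above, the following is a minimal sketch of the group-relative advantage at the core of GRPO: several completions are sampled for the same prompt, and each reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` term, and the reward values are illustrative assumptions; this card does not document the actual reward function or training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and std deviation.

    `rewards` holds one scalar per completion sampled for the same prompt.
    `eps` (an assumption of this sketch) avoids division by zero when every
    completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four candidate moves sampled from one position.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average get a positive advantage and are reinforced, while those below average are discouraged, so no separate value model is needed to provide a baseline.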