---
license: apache-2.0
language:
- en
tags:
- chess
- reinforcement-learning
- grpo
- game-playing
pipeline_tag: text-generation
---

# Chess GRPO Trained Model

This model was trained with Group Relative Policy Optimization (GRPO) to play chess. It generates each move as a JSON object, accompanied by the reasoning behind the move.
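
The exact JSON schema is not documented in this card, so the following is only a sketch of how a response might be parsed. It assumes hypothetical keys `move` (a UCI string such as `e2e4`) and `reasoning`; the helper name `parse_move_response` is likewise illustrative. Adjust both to match what the model actually emits.

```python
import json
import re

# Assumed (hypothetical) schema: {"move": "<UCI move>", "reasoning": "<text>"}
UCI_PATTERN = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")


def parse_move_response(text):
    """Return (move, reasoning) parsed from a model response, or None if invalid."""
    # Take the span from the first '{' to the last '}' so that any
    # surrounding chatter around the JSON object is ignored.
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        payload = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    move = payload.get("move")
    # Only the UCI syntax is checked here, not legality in a given position.
    if not isinstance(move, str) or not UCI_PATTERN.match(move):
        return None
    return move, str(payload.get("reasoning", ""))
```

The check is purely syntactic. Whether a move is legal on a particular board would need a chess library on top of this.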

## Model Details

- **Model Type**: PEFT (merged)
- **Training Method**: GRPO (Group Relative Policy Optimization)
- **Task**: Chess move generation with evaluation reasoning
- **Source Path**: `./grpo_output/skill_6-final`
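
For readers unfamiliar with GRPO, its core idea is to score each sampled completion relative to the other completions drawn for the same prompt, rather than against a learned value function. The sketch below shows that normalization step only. The reward values are made up, the function name is illustrative, and using the population standard deviation is a choice made here; the reward function and settings used for this model are not described in this card.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative rewards for four sampled moves from one position (made-up values).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that score above the group mean receive a positive advantage and are reinforced; those below the mean receive a negative one.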