---
license: apache-2.0
language:
- en
tags:
- chess
- reinforcement-learning
- grpo
- game-playing
pipeline_tag: text-generation
---
# Chess GRPO Trained Model
This model was trained with Group Relative Policy Optimization (GRPO) to play chess. It generates each chess move as JSON, together with the reasoning behind the move.
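Because the model replies with JSON, the response usually needs to be parsed and validated before a move is played. The snippet below is a minimal sketch of that step. This card does not document the output schema, so the `move` and `reasoning` keys, the UCI move notation, and the `parse_move_response` helper are all assumptions made for illustration.

```python
import json
import re

# Assumed schema: {"move": "<UCI move>", "reasoning": "<text>"}.
# The real key names and move notation are not documented in this card.
UCI_MOVE = re.compile(r"[a-h][1-8][a-h][1-8][qrbn]?")


def parse_move_response(text):
    """Return (move, reasoning) from a model response, or None if unusable."""
    # The model may wrap the JSON in extra text, so take the outermost braces.
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        payload = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    move = payload.get("move")
    # Syntax check only; legality must be checked against the actual board.
    if not isinstance(move, str) or not UCI_MOVE.fullmatch(move):
        return None
    return move, str(payload.get("reasoning", ""))
```

This only checks that the move is well formed. Whether it is legal in the current position would still have to be verified separately, for example with a chess library.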
## Model Details
- **Model Type**: PEFT adapter, merged into the base model
- **Training Method**: GRPO (Group Relative Policy Optimization)
- **Task**: Chess move generation with evaluation reasoning
- **Source Path**: ./grpo_output/skill_6-final
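For readers unfamiliar with GRPO, its core idea is to sample a group of completions for the same prompt and score each one relative to the rest of the group, rather than against a learned value function. The sketch below shows only that normalization step. The reward values and the `group_relative_advantages` helper are hypothetical, and the reward function and hyperparameters used to train this model are not stated in this card.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of one sampled group to zero mean and unit scale.

    Each completion's advantage is its reward minus the group mean, divided
    by the group standard deviation. Implementations differ on population
    versus sample standard deviation; population is assumed here.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four candidate moves in the same position.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat the group average get a positive advantage and are reinforced; those below average are discouraged. When all rewards in a group are equal, every advantage is zero and the group contributes no learning signal.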