The neural network architecture consists of:

* `Dense(NUM_POSSIBLE_MOVES, activation='softmax', name='policy_head')` for move probabilities
* `Dense(1, activation='tanh', name='value_head')` for win/loss estimation
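The two output heads can be sketched in Keras as follows. This is a minimal illustration only: the `NUM_POSSIBLE_MOVES` value (AlphaZero's 8×8×73 = 4672 encoding), the 8×8×12 input planes, and the small convolutional trunk are all assumptions, not this repository's actual architecture.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Assumed move-encoding size; the real value depends on the move encoding used.
NUM_POSSIBLE_MOVES = 4672

# Hypothetical shared trunk feeding the two heads named above.
board_input = keras.Input(shape=(8, 8, 12), name='board')
x = layers.Conv2D(64, 3, padding='same', activation='relu')(board_input)
x = layers.Flatten()(x)
policy = layers.Dense(NUM_POSSIBLE_MOVES, activation='softmax', name='policy_head')(x)
value = layers.Dense(1, activation='tanh', name='value_head')(x)
model = keras.Model(inputs=board_input, outputs=[policy, value])
```

The softmax head yields a probability distribution over all encoded moves, while the tanh head squashes the position estimate into [-1, 1] (loss to win).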
### Training Data

The model was trained on data generated through self-play: the engine plays chess games against itself, and the resulting games are used to train the network iteratively. This process is similar to the AlphaZero approach.
### Training Procedure

1. **Self-Play**: The engine plays against itself using MCTS to make move decisions, generating game trajectories.
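The self-play step can be sketched as a generic game loop. Everything here is hypothetical scaffolding rather than the repository's code: `select_move` stands in for the MCTS search (returning a move plus its visit-count distribution), and the state hooks are passed in so the loop shape stays visible.

```python
def self_play_game(select_move, initial_state, apply_move, is_over, result):
    """Play one game against itself; return (state, mcts_policy, outcome) tuples."""
    trajectory = []  # (state, policy target from MCTS visit counts, player to move)
    state, player = initial_state(), 1
    while not is_over(state):
        move, policy = select_move(state)  # MCTS: chosen move + visit distribution
        trajectory.append((state, policy, player))
        state = apply_move(state, move)
        player = -player
    z = result(state)  # +1 if the first player won, -1 if it lost, 0 for a draw
    # Label every stored position with the final outcome from the mover's viewpoint.
    return [(s, p, z * who) for s, p, who in trajectory]
```

Each tuple then serves as one training sample: the policy head is fit to the MCTS distribution and the value head to the game outcome.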
The optimizer used during training is **Adam** with a learning rate of 0.001.
### Training parameters

* `num_self_play_games = 50`
* `epochs = 5`
* `num_simulations_per_move = 100`
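A minimal sketch of how these parameters might wire together in one training iteration; the `generate_game` and `fit_network` hooks are hypothetical placeholders, not functions from this repository.

```python
num_self_play_games = 50
epochs = 5
num_simulations_per_move = 100

def train_iteration(generate_game, fit_network):
    """One iteration: generate self-play games, then fit on the pooled samples."""
    dataset = []
    for _ in range(num_self_play_games):
        # Each game's MCTS runs num_simulations_per_move rollouts per decision.
        dataset.extend(generate_game(num_simulations_per_move))
    fit_network(dataset, epochs=epochs)
    return len(dataset)
```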
### Model Versions

This model has been converted into several formats for flexible deployment:

The model files are versioned by training timestamp, appended to the model name, so that each filename is unique.
For example: `StockZero-2025-03-24-1727.weights.h5` or `converted_models-202503241727.zip`.
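The timestamp scheme in these example filenames can be reproduced with `strftime`; the helper below is illustrative, not part of the codebase.

```python
from datetime import datetime

def versioned_names(now=None):
    """Build timestamped filenames matching the examples above."""
    now = now or datetime.now()
    weights = f"StockZero-{now.strftime('%Y-%m-%d-%H%M')}.weights.h5"
    archive = f"converted_models-{now.strftime('%Y%m%d%H%M')}.zip"
    return weights, archive
```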
### Intended Use

The model is intended for research, experimentation, and educational purposes. Potential applications include:
## Model Evaluation

### Loss Curve

The following image shows the training loss curve for the model:

![Training Loss Curve](train_loss_curve.png)

This model was evaluated against a simple random-move opponent using the `evaluate_model` method in the provided `evaluation_script.py`. The results are as follows:

* **Number of Games:** 200 (The model plays as both white and black in each game against the random agent.)
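A sketch of what an `evaluate_model`-style loop could look like. The actual interface in `evaluation_script.py` may differ; the `play_game` hook and the result tallying here are assumptions made for illustration.

```python
import random

def evaluate(model_move, num_games, play_game):
    """Tally model results over num_games, alternating colors each game."""
    def random_move(legal_moves):
        return random.choice(legal_moves)  # the random baseline opponent
    scores = {'win': 0, 'draw': 0, 'loss': 0}
    for g in range(num_games):
        model_is_white = (g % 2 == 0)  # swap colors so both sides are covered
        result = play_game(model_move, random_move, model_is_white)
        scores[result] += 1
    return scores
```

Alternating colors every game removes any first-move advantage from the aggregate score.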