nirajandhakal committed 5b6c844 · verified · 1 parent: 19e2192

Update README.md

Files changed (1): README.md (+11 −0)
README.md CHANGED

@@ -53,10 +53,12 @@ The neural network architecture consists of:
 * `Dense(NUM_POSSIBLE_MOVES, activation='softmax', name='policy_head')` for move probabilities
 * `Dense(1, activation='tanh', name='value_head')` for win/loss estimation
 
+
 ### Training Data
 
 The model was trained on data generated from self-play, playing chess games against itself, with the generated self-play games then used to train the network iteratively. This process is similar to the AlphaZero approach.
 
+
 ### Training Procedure
 
 1. **Self-Play**: The engine plays against itself using MCTS to make move decisions, generating game trajectories.
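The two output heads in the hunk above can be illustrated with a small NumPy sketch. This is not the actual Keras model: the `NUM_POSSIBLE_MOVES` value (4672, a commonly used chess move-encoding size) and the 256-dimensional trunk features are illustrative assumptions, not confirmed by the README.

```python
import numpy as np

NUM_POSSIBLE_MOVES = 4672  # illustrative assumption; the README does not state the value

def softmax(x):
    # numerically stable softmax, as applied by the policy head
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
trunk_features = rng.normal(size=256)  # stand-in for the shared trunk output

# policy head: Dense(NUM_POSSIBLE_MOVES, activation='softmax')
W_p = rng.normal(size=(NUM_POSSIBLE_MOVES, 256)) * 0.01
policy = softmax(W_p @ trunk_features)  # probability distribution over encoded moves

# value head: Dense(1, activation='tanh')
w_v = rng.normal(size=256) * 0.01
value = np.tanh(w_v @ trunk_features)  # scalar in [-1, 1]: predicted loss ... win

print(policy.shape, float(policy.sum()), float(value))
```

The softmax head yields a distribution over every encodable move (illegal moves are typically masked out at search time), while the tanh head compresses the position estimate into [-1, 1] so it can be compared directly against game outcomes.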
@@ -65,12 +67,14 @@ The model was trained on data generated from self-play, playing chess games agai
 
 The optimizer used during training is **Adam** with a learning rate of 0.001.
 
+
 ### Training parameters
 
 * `num_self_play_games = 50`
 * `epochs = 5`
 * `num_simulations_per_move = 100`
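The hyperparameters listed above could be gathered into one place, e.g. as a frozen dataclass. The three parameter values and the Adam learning rate come from the README; the dataclass itself and its field layout are my own illustrative packaging.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    # values as stated in the README; the class itself is an illustrative sketch
    num_self_play_games: int = 50
    epochs: int = 5
    num_simulations_per_move: int = 100
    learning_rate: float = 0.001  # Adam optimizer, per the text above

cfg = TrainingConfig()
print(cfg)
```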
 
+
 ### Model Versions
 
 This model has been converted into several formats for flexible deployment:
@@ -88,6 +92,7 @@ This model has been converted into several formats for flexible deployment:
 The model files are versioned by training timestamp to keep filenames unique; the timestamp is appended to the model name.
 For example: `StockZero-2025-03-24-1727.weights.h5` or `converted_models-202503241727.zip`.
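The timestamp-based naming above can be reproduced with `datetime` formatting. The helper name is hypothetical (it is not claimed to exist in the repository); only the filename pattern is taken from the README's example.

```python
from datetime import datetime

def versioned_weights_name(prefix="StockZero", when=None):
    # Builds a filename like StockZero-2025-03-24-1727.weights.h5
    # (hypothetical helper; the pattern matches the README's example)
    when = when or datetime.now()
    return f"{prefix}-{when:%Y-%m-%d-%H%M}.weights.h5"

print(versioned_weights_name(when=datetime(2025, 3, 24, 17, 27)))
# → StockZero-2025-03-24-1727.weights.h5
```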
 
+
 ### Intended Use
 
 The model is intended for research, experimentation, and education purposes. Potential applications include:
@@ -106,6 +111,12 @@ The model is intended for research, experimentation, and education purposes. Pot
 
 ## Model Evaluation
 
+### Loss Curve
+
+The following image shows the training loss curve for the model:
+
+![Training Loss Curve](https://huggingface.co/nirajandhakal/StockZero/resolve/main/StockZero-v2%20model%20evaluation.png)
+
 This model was evaluated against a simple random move opponent using the `evaluate_model` method in the provided `evaluation_script.py`. The results are as follows:
 
 * **Number of Games:** 200 (The model plays as both white and black in each game against the random agent.)
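The evaluation setup added above (200 games, model playing both colours against a random mover) can be sketched as a tally loop. This is not the repository's `evaluate_model`: `play_game` here is a stub that returns random results, standing in for an actual MCTS-guided game against a uniformly random opponent.

```python
import random

def play_game(model_plays_white, rng):
    # Stub: a real implementation would play out a full chess game
    # (model via MCTS vs. random mover) and return the model's result.
    return rng.choice(["win", "draw", "loss"])

def evaluate_vs_random(num_games=200, seed=0):
    rng = random.Random(seed)
    tally = {"win": 0, "draw": 0, "loss": 0}
    for i in range(num_games):
        # alternate colours so the model plays both white and black
        tally[play_game(model_plays_white=(i % 2 == 0), rng=rng)] += 1
    return tally

tally = evaluate_vs_random()
print(tally, "total:", sum(tally.values()))
```

Alternating colours each game, as the README describes, removes first-move bias from the win-rate estimate.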