The neural network architecture consists of:

* `Dense(NUM_POSSIBLE_MOVES, activation='softmax', name='policy_head')` for move probabilities
* `Dense(1, activation='tanh', name='value_head')` for win/loss estimation
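The two output heads can be sketched in Keras as follows. This is a minimal illustration only: the `NUM_POSSIBLE_MOVES` value (AlphaZero's 8×8×73 = 4672 encoding), the 8×8×12 input planes, and the small convolutional trunk are all assumptions, not this repository's actual architecture.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Assumed move-encoding size; the real value depends on the move encoding used.
NUM_POSSIBLE_MOVES = 4672

# Hypothetical shared trunk feeding the two heads named above.
board_input = keras.Input(shape=(8, 8, 12), name='board')
x = layers.Conv2D(64, 3, padding='same', activation='relu')(board_input)
x = layers.Flatten()(x)
policy = layers.Dense(NUM_POSSIBLE_MOVES, activation='softmax', name='policy_head')(x)
value = layers.Dense(1, activation='tanh', name='value_head')(x)
model = keras.Model(inputs=board_input, outputs=[policy, value])
```

The softmax head yields a probability distribution over all encoded moves, while the tanh head squashes the position estimate into [-1, 1] (loss to win).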
### Training Data

The model was trained on data generated through self-play: the engine plays chess games against itself, and the resulting games are used to train the network iteratively. This process is similar to the AlphaZero approach.
### Training Procedure

1. **Self-Play**: The engine plays against itself using MCTS to make move decisions, generating game trajectories.
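The self-play step can be sketched as a generic game loop. Everything here is hypothetical scaffolding rather than the repository's code: `select_move` stands in for the MCTS search (returning a move plus its visit-count distribution), and the state hooks are passed in so the loop shape stays visible.

```python
def self_play_game(select_move, initial_state, apply_move, is_over, result):
    """Play one game against itself; return (state, mcts_policy, outcome) tuples."""
    trajectory = []  # (state, policy target from MCTS visit counts, player to move)
    state, player = initial_state(), 1
    while not is_over(state):
        move, policy = select_move(state)  # MCTS: chosen move + visit distribution
        trajectory.append((state, policy, player))
        state = apply_move(state, move)
        player = -player
    z = result(state)  # +1 if the first player won, -1 if it lost, 0 for a draw
    # Label every stored position with the final outcome from the mover's viewpoint.
    return [(s, p, z * who) for s, p, who in trajectory]
```

Each tuple then serves as one training sample: the policy head is fit to the MCTS distribution and the value head to the game outcome.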
The optimizer used during training is **Adam** with a learning rate of 0.001.
### Training parameters

* `num_self_play_games = 50`
* `epochs = 5`
* `num_simulations_per_move = 100`
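A minimal sketch of how these parameters might wire together in one training iteration; the `generate_game` and `fit_network` hooks are hypothetical placeholders, not functions from this repository.

```python
num_self_play_games = 50
epochs = 5
num_simulations_per_move = 100

def train_iteration(generate_game, fit_network):
    """One iteration: generate self-play games, then fit on the pooled samples."""
    dataset = []
    for _ in range(num_self_play_games):
        # Each game's MCTS runs num_simulations_per_move rollouts per decision.
        dataset.extend(generate_game(num_simulations_per_move))
    fit_network(dataset, epochs=epochs)
    return len(dataset)
```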
### Model Versions

This model has been converted into several formats for flexible deployment:

The model files are versioned by training timestamp, appended to the model name, so that each filename is unique.
For example: `StockZero-2025-03-24-1727.weights.h5` or `converted_models-202503241727.zip`.
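The timestamp scheme in these example filenames can be reproduced with `strftime`; the helper below is illustrative, not part of the codebase.

```python
from datetime import datetime

def versioned_names(now=None):
    """Build timestamped filenames matching the examples above."""
    now = now or datetime.now()
    weights = f"StockZero-{now.strftime('%Y-%m-%d-%H%M')}.weights.h5"
    archive = f"converted_models-{now.strftime('%Y%m%d%H%M')}.zip"
    return weights, archive
```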
### Intended Use

The model is intended for research, experimentation, and educational purposes. Potential applications include:
## Model Evaluation

### Loss Curve

The following image shows the training loss curve for the model:

![Training Loss Curve](train_loss_curve.png)

This model was evaluated against a simple random-move opponent using the `evaluate_model` method in the provided `evaluation_script.py`. The results are as follows:

* **Number of Games:** 200 (The model plays as both white and black in each game against the random agent.)
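A sketch of what an `evaluate_model`-style loop could look like. The actual interface in `evaluation_script.py` may differ; the `play_game` hook and the result tallying here are assumptions made for illustration.

```python
import random

def evaluate(model_move, num_games, play_game):
    """Tally model results over num_games, alternating colors each game."""
    def random_move(legal_moves):
        return random.choice(legal_moves)  # the random baseline opponent
    scores = {'win': 0, 'draw': 0, 'loss': 0}
    for g in range(num_games):
        model_is_white = (g % 2 == 0)  # swap colors so both sides are covered
        result = play_game(model_move, random_move, model_is_white)
        scores[result] += 1
    return scores
```

Alternating colors every game removes any first-move advantage from the aggregate score.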