Commit 272747a · Update README.md
Parent(s): 5f01be6
README.md CHANGED
@@ -32,4 +32,29 @@ model-index:
 # Don't forget to check if you need to add additional attributes (is_slippery=False etc)
 env = gym.make(model["env_id"])
 ```
-
+
+Making this Q-learning agent work requires extended training; otherwise the agent never successfully reaches the goal and training does not converge.
+
+In my case I found 50 million training episodes sufficient with the following hyperparameters:
+
+```python
+# Training parameters
+n_training_episodes = 50000000  # Total training episodes
+learning_rate = 0.99            # Learning rate
+
+# Evaluation parameters
+n_eval_episodes = 100           # Total number of test episodes
+
+# Environment parameters
+env_id = "FrozenLake-v1"        # Name of the environment
+max_steps = 200                 # Max steps per episode
+gamma = 0.99                    # Discounting rate
+epsilon = 0.1                   # Ideal epsilon
+eval_seed = []                  # The evaluation seed of the environment
+
+# Exploration parameters
+max_epsilon = 1                 # Exploration probability at start
+min_epsilon = 0.05              # Minimum exploration probability
+decay_rate = 0.0005             # Exponential decay rate for exploration prob
+```
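
For reference, here is a minimal sketch of how these hyperparameters are typically plugged into a tabular Q-learning loop. It is not part of this commit: the loop structure, the `Qtable` array, and the Gymnasium-style `reset()`/`step()` API are assumptions on my part, it reuses the hyperparameter names defined above, and `is_slippery=True` is included only to echo the README's earlier note about additional `gym.make` attributes.

```python
import numpy as np
import gymnasium as gym  # assumption: Gymnasium API (reset -> (obs, info), step -> 5-tuple)

# Create the environment; is_slippery is one of the "additional attributes"
# mentioned earlier, so set it to match how the agent is meant to be trained.
env = gym.make(env_id, is_slippery=True)

# Q-table with one row per state and one column per action, initialised to zeros
Qtable = np.zeros((env.observation_space.n, env.action_space.n))

for episode in range(n_training_episodes):
    # Decay epsilon exponentially from max_epsilon towards min_epsilon
    epsilon = min_epsilon + (max_epsilon - min_epsilon) * np.exp(-decay_rate * episode)

    state, info = env.reset()
    for step in range(max_steps):
        # Epsilon-greedy action selection
        if np.random.random() > epsilon:
            action = int(np.argmax(Qtable[state]))   # exploit the current Q-values
        else:
            action = env.action_space.sample()       # explore with a random action

        new_state, reward, terminated, truncated, info = env.step(action)

        # Q-learning update: Q(s,a) <- Q(s,a) + lr * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Qtable[state][action] += learning_rate * (
            reward + gamma * np.max(Qtable[new_state]) - Qtable[state][action]
        )

        state = new_state
        if terminated or truncated:
            break
```

The exploration schedule is what gives the exploration parameters above their meaning: epsilon starts near `max_epsilon`, so early episodes act almost randomly, and decays towards `min_epsilon` at a speed set by `decay_rate`, after which the agent mostly exploits what it has learned.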