Update README.md
README.md
CHANGED
@@ -11,19 +11,19 @@ model-index:
 results:
 - metrics:
 - type: FAS (J=1)
-value: 0.
+value: 0.070768 ± 0.011055
 name: FAS
 - type: FAS (J=2)
-value: 0.
+value: 0.083818 ± 0.025049
 name: FAS
 - type: FAS (J=4)
-value: 0.
+value: 0.137035 ± 0.042001
 name: FAS
 - type: FAS (J=8)
-value: 0.
+value: 0.232737 ± 0.065282
 name: FAS
 - type: FAS (J=16)
-value: 0.
+value: 0.150935 ± 0.043573
 name: FAS
 task:
 type: OpenAI Gym
@@ -36,57 +36,17 @@ model-index:
 ---
 # Soft-Actor-Critic: Walker2d-v2
 
-These are 25 trained models over **seeds (0-4)**
+These are 25 trained models over **seeds (0-4)** and **J = 1, 2, 4, 8, 16** of a **Soft Actor Critic (SAC)** agent playing **Walker2d-v2** from **[Sequence Reinforcement Learning (SRL)](https://github.com/dee0512/Sequence-Reinforcement-Learning)**.
 
 ## Model Sources
 
 **Repository:** [https://github.com/dee0512/Sequence-Reinforcement-Learning](https://github.com/dee0512/Sequence-Reinforcement-Learning)
 **Paper (ICLR):** [https://openreview.net/forum?id=w3iM4WLuvy](https://openreview.net/forum?id=w3iM4WLuvy)
-**Arxiv:** [arxiv.org/pdf/2410.08979](https://arxiv.org/pdf/2410.08979)
+**Arxiv:** [https://arxiv.org/pdf/2410.08979](https://arxiv.org/pdf/2410.08979)
 
-
-Using the repository:
-
-```
-python .\train_sac.py --env_name <env_name> --seed <seed> --j <j>
-```
-
-# Evaluation:
+## Training Details
 
-Download the models folder and place it in the same directory as the cloned repository.
 Using the repository:
 
-```
-python
-```
-
-## Metrics:
-
-**FAS:** Frequency Averaged Score
-**j:** Action repetition parameter
-
-
-# Citation
-
-The paper can be cited with the following bibtex entry:
-
-## BibTeX:
-
-```
-@inproceedings{DBLP:conf/iclr/PatelS25,
-author = {Devdhar Patel and
-Hava T. Siegelmann},
-title = {Overcoming Slow Decision Frequencies in Continuous Control: Model-Based
-Sequence Reinforcement Learning for Model-Free Control},
-booktitle = {The Thirteenth International Conference on Learning Representations,
-{ICLR} 2025, Singapore, April 24-28, 2025},
-publisher = {OpenReview.net},
-year = {2025},
-url = {https://openreview.net/forum?id=w3iM4WLuvy}
-}
-```
-
-## APA:
-```
-Patel, D., & Siegelmann, H. T. Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control. In The Thirteenth International Conference on Learning Representations.
-```
+```bash
+python ./train_sac.py --env_name Walker2d-v2 --seed <seed> --j <j>