---
license: bsd-3-clause
tags:
- Humanoid-v2
- reinforcement-learning
- Soft Actor Critic
- SRL
- deep-reinforcement-learning
model-index:
- name: SAC
  results:
  - metrics:
    - type: FAS (J=1)
      value: 0.064253 ± 0.00638
      name: FAS
    - type: FAS (J=2)
      value: 0.056522 ± 0.012575
      name: FAS
    - type: FAS (J=4)
      value: 0.080906 ± 0.030329
      name: FAS
    - type: FAS (J=8)
      value: 0.172967 ± 0.022553
      name: FAS
    - type: FAS (J=16)
      value: 0.182832 ± 0.038443
      name: FAS
    task:
      type: OpenAI Gym
      name: OpenAI Gym
    dataset:
      name: Humanoid-v2
      type: Humanoid-v2
Paper: https://arxiv.org/pdf/2410.08979
Code: https://github.com/dee0512/Sequence-Reinforcement-Learning
---

# Soft-Actor-Critic: Humanoid-v2

These are 25 trained models, covering **seeds 0-4** and **J = 1, 2, 4, 8, 16**, of a **Soft Actor-Critic (SAC)** agent playing **Humanoid-v2**, from **[Sequence Reinforcement Learning (SRL)](https://github.com/dee0512/Sequence-Reinforcement-Learning)**.
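
All checkpoints can be pulled locally in one call; a minimal sketch using `huggingface_hub`, where the `repo_id` below is a placeholder, not this model page's actual id:

```
# Sketch: download all 25 checkpoints from this model page.
# Assumes `pip install huggingface_hub`; the repo_id is a placeholder,
# not the real id of this repository.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="<org>/<repo>")
print(f"Checkpoints saved under: {local_dir}")
```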

## Model Sources

**Repository:** [https://github.com/dee0512/Sequence-Reinforcement-Learning](https://github.com/dee0512/Sequence-Reinforcement-Learning)
**Paper (ICLR):** [https://openreview.net/forum?id=w3iM4WLuvy](https://openreview.net/forum?id=w3iM4WLuvy)
**Arxiv:** [arxiv.org/pdf/2410.08979](https://arxiv.org/pdf/2410.08979)

# Training Details:

From the root of the cloned repository, run:

```
python train_sac.py --env_name <env_name> --seed <seed> --j <j>
```
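
To reproduce every checkpoint in this card, the runs can be scripted; a minimal sketch, assuming the repository's `train_sac.py` entry point shown above:

```
# Sketch: launch all 25 training runs (seeds 0-4, J in {1, 2, 4, 8, 16}).
# Assumes it is run from the repository root, next to train_sac.py.
import subprocess

for seed in range(5):               # seeds 0-4
    for j in (1, 2, 4, 8, 16):      # action repetition values
        subprocess.run(
            ["python", "train_sac.py",
             "--env_name", "Humanoid-v2",
             "--seed", str(seed),
             "--j", str(j)],
            check=True,             # stop if any run fails
        )
```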

# Evaluation:

Download the models folder and place it in the same directory as the cloned repository.
Then, from the root of the repository, run:

```
python eval_sac.py --env_name <env_name> --seed <seed> --j <j>
```

## Metrics:

**FAS:** Frequency Averaged Score
**J:** Action repetition parameter
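
Each metric above is reported as a mean ± standard deviation of FAS over the five seeds. A minimal sketch of that aggregation, with hypothetical per-seed scores standing in for real evaluation output:

```
# Sketch: aggregate per-seed FAS scores into the "mean ± std" form used
# in the metadata above. The scores below are hypothetical placeholders.
import statistics

fas_by_seed = [0.061, 0.070, 0.058, 0.068, 0.064]  # one score per seed (0-4)

mean = statistics.mean(fas_by_seed)
std = statistics.stdev(fas_by_seed)  # sample std; the paper's convention may differ
print(f"FAS: {mean:.6f} ± {std:.6f}")
```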

# Citation

The paper can be cited with the following BibTeX entry:

## BibTeX:

```
@inproceedings{DBLP:conf/iclr/PatelS25,
  author    = {Devdhar Patel and
               Hava T. Siegelmann},
  title     = {Overcoming Slow Decision Frequencies in Continuous Control: Model-Based
               Sequence Reinforcement Learning for Model-Free Control},
  booktitle = {The Thirteenth International Conference on Learning Representations,
               {ICLR} 2025, Singapore, April 24-28, 2025},
  publisher = {OpenReview.net},
  year      = {2025},
  url       = {https://openreview.net/forum?id=w3iM4WLuvy}
}
```

## APA:
```
Patel, D., & Siegelmann, H. T. (2025). Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control. In The Thirteenth International Conference on Learning Representations.
```