devdharpatel committed · verified · Commit 9588686 · Parent(s): 0976c05

Update README.md
---
license: bsd-3-clause
tags:
- Hopper-v2
- reinforcement-learning
- Soft Actor Critic
- SRL
- deep-reinforcement-learning
model-index:
- name: SAC
  results:
  - metrics:
    - type: FAS (J=1)
      value: 0.050304 ± 0.020365
      name: FAS
    - type: FAS (J=2)
      value: 0.092501 ± 0.010512
      name: FAS
    - type: FAS (J=4)
      value: 0.135757 ± 0.030884
      name: FAS
    - type: FAS (J=8)
      value: 0.141675 ± 0.038575
      name: FAS
    - type: FAS (J=16)
      value: 0.263203 ± 0.079994
      name: FAS
    task:
      type: OpenAI Gym
      name: OpenAI Gym
    dataset:
      name: Hopper-v2
      type: Hopper-v2
Paper: https://arxiv.org/pdf/2410.08979
Code: https://github.com/dee0512/Sequence-Reinforcement-Learning
---
# Soft-Actor-Critic: Hopper-v2

These are 25 trained models, over **seeds 0-4** and **J = 1, 2, 4, 8, 16**, of a **Soft Actor-Critic** agent playing **Hopper-v2** for **[Sequence Reinforcement Learning (SRL)](https://github.com/dee0512/Sequence-Reinforcement-Learning)**.

## Model Sources

**Repository:** [https://github.com/dee0512/Sequence-Reinforcement-Learning](https://github.com/dee0512/Sequence-Reinforcement-Learning)
**Paper (ICLR):** [https://openreview.net/forum?id=w3iM4WLuvy](https://openreview.net/forum?id=w3iM4WLuvy)
**Arxiv:** [arxiv.org/pdf/2410.08979](https://arxiv.org/pdf/2410.08979)

# Training Details:
From the root of the cloned repository, run:

```
python .\train_sac.py --env_name <env_name> --seed <seed> --j <j>
```
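
The 25 checkpoints correspond to the full cross product of the five seeds and five J values. A minimal sweep sketch, assuming a POSIX shell and substituting Hopper-v2 for `<env_name>` (the leading `echo` prints each command rather than launching it):

```shell
# Enumerate all 25 training runs: seeds 0-4 x J in {1, 2, 4, 8, 16}.
# Drop the leading `echo` to actually launch the runs.
for seed in 0 1 2 3 4; do
  for j in 1 2 4 8 16; do
    echo python train_sac.py --env_name Hopper-v2 --seed "$seed" --j "$j"
  done
done
```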

# Evaluation:

Download the models folder and place it in the same directory as the cloned repository.
Then run:

```
python .\eval_sac.py --env_name <env_name> --seed <seed> --j <j>
```

## Metrics:

**FAS:** Frequency Averaged Score
**J:** Action repetition parameter

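
Each FAS value in the metadata above is reported as a mean ± standard deviation over the five seeds. As a hypothetical illustration of that aggregation (placeholder scores, not results from these models; population standard deviation is assumed, which may differ from the paper's convention):

```shell
# Aggregate five per-seed FAS scores (placeholder values) into "mean ± std".
printf '%s\n' 0.04 0.05 0.06 0.05 0.05 |
awk '{ s += $1; q += $1 * $1; n++ }
     END { m = s / n; printf "FAS: %.6f ± %.6f\n", m, sqrt(q / n - m * m) }'
```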

# Citation

The paper can be cited with the following BibTeX entry:

## BibTeX:

```
@inproceedings{DBLP:conf/iclr/PatelS25,
  author    = {Devdhar Patel and
               Hava T. Siegelmann},
  title     = {Overcoming Slow Decision Frequencies in Continuous Control: Model-Based
               Sequence Reinforcement Learning for Model-Free Control},
  booktitle = {The Thirteenth International Conference on Learning Representations,
               {ICLR} 2025, Singapore, April 24-28, 2025},
  publisher = {OpenReview.net},
  year      = {2025},
  url       = {https://openreview.net/forum?id=w3iM4WLuvy}
}
```

## APA:
```
Patel, D., & Siegelmann, H. T. (2025). Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control. In The Thirteenth International Conference on Learning Representations.
```