[ICLR'24 Spotlight] Tool-Augmented Reward Modeling
ernie-research
community
AI & ML interests: Large Language Models
Models (12)

ernie-research/Themis-7b • 71 • 4

ernie-research/APPS-Gemma-7B-MA-PPO-Fixed10 • 9B • 42

ernie-research/APPS-Gemma-2B-MA-PPO-Fixed10 • 3B • 9

ernie-research/HH-RLHF-Gemma-2B-MA-PPO-Fixed5 • 3B • 17

ernie-research/HH-RLHF-Gemma-7B-MA-PPO-Fixed5 • 9B • 33

ernie-research/TLDR-Gemma-7B-MA-PPO-Fixed5 • 9B • 13

ernie-research/TLDR-Gemma-2B-MA-PPO-Fixed5 • 3B • 10

ernie-research/TLDR-Gemma-2-27B-MA-PPO-Fixed5 • 27B • 36

ernie-research/ernie-code-560m • Text Generation • 29 • 10

ernie-research/MonoGPT • Text Generation • 0.4B • 58 • 2