[ICLR'24 Spotlight] Tool-Augmented Reward Modeling

ernie-research
community
AI & ML interests: Large Language Models
Collections: 4
Models: 12

ernie-research/Themis-7b

ernie-research/APPS-Gemma-7B-MA-PPO-Fixed10

ernie-research/APPS-Gemma-2B-MA-PPO-Fixed10

ernie-research/HH-RLHF-Gemma-2B-MA-PPO-Fixed5

ernie-research/HH-RLHF-Gemma-7B-MA-PPO-Fixed5

ernie-research/TLDR-Gemma-7B-MA-PPO-Fixed5

ernie-research/TLDR-Gemma-2B-MA-PPO-Fixed5

ernie-research/TLDR-Gemma-2-27B-MA-PPO-Fixed5

ernie-research/ernie-code-560m
Text2Text Generation

ernie-research/MonoGPT
Text Generation