Models in Adaptive Length Penalty Paper
Models (8)
RLAIF/llama-3b-open-r1-50k-sft • 4B • 2
RLAIF/sft-external • Text Generation • 8B
RLAIF/sft-llama-3.1-8b-external • Text Generation • 8B
RLAIF/sft-gemma-2-9b-base-sft-llama-405b-instruct-correct-only-format-lr-5e-06-bs-64 • Text Generation • 9B
RLAIF/sft-llama8b-prm-800k-correct-only • Text Generation • 8B
RLAIF/22-sequential-temp-0-verifier-no-best-oracle-in-context-train-8 • 8B
RLAIF/22-sequential-temp-0-verifier-oracle-in-context-train-8-w-error-masking • 8B
RLAIF/15-w-error-masking-temp-0-verifier-in-context-train-in-context-inference-8-model • 8B • 2
Datasets (26)
RLAIF/mbpp • Viewer • 1.4k • 32
RLAIF/STAR-TRAIN-math_llama-star-iter5 • Viewer • 3.31k • 15
RLAIF/STAR-TRAIN-math_lama-star-iter4 • Viewer • 3.27k • 15
RLAIF/STAR-TRAIN-math_llama-star-iter3 • Viewer • 3.2k • 14
RLAIF/STAR-TRAIN-math_llama-star-iter2 • Viewer • 3.15k • 14
RLAIF/STAR-TRAIN-math_llama-star-iter1 • Viewer • 2.93k • 16
RLAIF/math • Viewer • 12.5k • 38 • 1
RLAIF/iGSM-1M-retry0.5 • Viewer • 1.01M • 36
RLAIF/iGSM-1M-retry0.0 • Viewer • 1.01M • 24
RLAIF/iGSM-1M-retry0.6 • Viewer • 1.01M • 32