Menlo/ReZero-v0.1-llama-3.2-3b-it-grpo-250404
Tags: Text Generation, Transformers, TensorBoard, Safetensors, English, llama, conversational, text-generation-inference
arXiv: 2504.11001
License: llama3.2
Branch: main
5 contributors. History: 23 commits.
Latest commit: thinhlpg, "Update README.md" (7b5e4f1, verified), 7 days ago
Size       Flags      Updated      Last commit              File
1.68 kB    Safe       17 days ago  (Trained with Unsloth)   .gitattributes
7.71 kB    Safe       7 days ago   Upload LICENSE.txt       LICENSE.txt
3.93 kB    -          7 days ago   Update README.md         README.md
993 Bytes  -          17 days ago  (Trained with Unsloth)   config.json
855 kB     LFS        9 days ago   Upload training metrics  events.out.tfevents.1743708828.c6a7d9a1991e.65537.0
234 Bytes  Safe       17 days ago  (Trained with Unsloth)   generation_config.json
4.97 GB    LFS        17 days ago  (Trained with Unsloth)   model-00001-of-00002.safetensors
1.46 GB    LFS        17 days ago  (Trained with Unsloth)   model-00002-of-00002.safetensors
20.9 kB    Safe       17 days ago  (Trained with Unsloth)   model.safetensors.index.json
454 Bytes  Safe       17 days ago  (Trained with Unsloth)   special_tokens_map.json
17.2 MB    Safe, LFS  17 days ago  (Trained with Unsloth)   tokenizer.json
54.7 kB    Safe       17 days ago  (Trained with Unsloth)   tokenizer_config.json