tahamajs/Qwen3-4b-gsm8k-Qlora-GRPO
Text Generation
PEFT
Safetensors
Transformers
openai/gsm8k
English
grpo
lora
trl
unsloth
qwen3
bitsandbytes
4bit
conversational
License: other
main
tokenizer.json
Commit History
add files
973c01d
verified
tahamajs
committed on
22 days ago
Continue training: +5k steps on extra math data
f16a61e
verified
tahamajs
committed on
23 days ago