Orange-Innovation-Egypt/Qwen2.5-Coder-14B-Instruct-grpo-description-gv
Transformers
Safetensors
English
text-generation-inference
unsloth
qwen2
trl
License: apache-2.0
Branch: main
1 contributor
History: 8 commits
Latest commit: f19bba9 (verified) by gsoliman, 6 days ago
Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth)
File | Scan | Size | Last commit | Updated
.gitattributes | Safe | 1.57 kB | Upload model trained with Unsloth | 12 days ago
README.md | Safe | 622 Bytes | Upload README.md with huggingface_hub | 12 days ago
adapter_config.json | Safe | 873 Bytes | Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth) | 6 days ago
adapter_model.safetensors | Safe | 275 MB (LFS) | Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth) | 6 days ago
added_tokens.json | Safe | 632 Bytes | Upload model trained with Unsloth | 12 days ago
merges.txt | Safe | 1.67 MB | Upload model trained with Unsloth | 12 days ago
special_tokens_map.json | Safe | 613 Bytes | Upload model trained with Unsloth | 12 days ago
tokenizer.json | Safe | 11.4 MB (LFS) | Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth) | 6 days ago
tokenizer_config.json | Safe | 7.54 kB | Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth) | 6 days ago
vocab.json | Safe | 2.78 MB | Upload model trained with Unsloth | 12 days ago