Orange-Innovation-Egypt/Qwen2.5-Coder-14B-Instruct-grpo-description-gv

text-generation-inference

1 contributor

History: 8 commits

Latest commit by gsoliman:

Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth)

f19bba9 verified 6 days ago

File                       Size          Last commit
.gitattributes             1.57 kB       Upload model trained with Unsloth (12 days ago)
README.md                  622 Bytes     Upload README.md with huggingface_hub (12 days ago)
adapter_config.json        873 Bytes     Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth) (6 days ago)
adapter_model.safetensors  275 MB (LFS)  Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth) (6 days ago)
added_tokens.json          632 Bytes     Upload model trained with Unsloth (12 days ago)
merges.txt                 1.67 MB       Upload model trained with Unsloth (12 days ago)
special_tokens_map.json    613 Bytes     Upload model trained with Unsloth (12 days ago)
tokenizer.json             11.4 MB (LFS) Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth) (6 days ago)
tokenizer_config.json      7.54 kB       Updated model with fine-tuning with max steps of 250 and num_generations of 2 with designing the reward function (Trained with Unsloth) (6 days ago)
vocab.json                 2.78 MB       Upload model trained with Unsloth (12 days ago)