Model Details
Model Description
A Llama-3.1-8B model fine-tuned with the ORPO trainer.
Training Details
Training Data
The model was fine-tuned on the mlabonne/orpo-dpo-mix-40k dataset.
[More Information Needed]
Training Procedure
The model was trained with the ORPO trainer. Only the first 5K rows of the dataset were used for fine-tuning (5K out of 40K).
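The row selection described above could be reproduced roughly as follows. This is a minimal sketch, not the original training script: it assumes the Hugging Face `datasets` library, assumes the dataset exposes a `train` split, and the helper names `first_rows_split` and `load_subset` are hypothetical.

```python
def first_rows_split(n_rows: int, split: str = "train") -> str:
    """Build a split-slicing string such as 'train[:5000]'.

    The split name 'train' is an assumption about the dataset layout.
    """
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    return f"{split}[:{n_rows}]"


def load_subset(n_rows: int = 5000):
    """Load only the first n_rows rows of the fine-tuning dataset."""
    # Imported lazily so first_rows_split works without the library installed.
    from datasets import load_dataset

    return load_dataset(
        "mlabonne/orpo-dpo-mix-40k",
        split=first_rows_split(n_rows),
    )
```

Calling `load_subset()` downloads the dataset from the Hub and returns the leading slice, which could then be passed to an ORPO trainer as its training set.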