Model Details

Model Description

Llama-3.1-8B model trained with ORPO trainer.

Training Details

Training Data

mlabonne/orpo-dpo-mix-40k is used for finetuning this model.

[More Information Needed]

Training Procedure

Trained with ORPO trainer, and only first 5K rows are used for finetuning (5K out of 40K).

Downloads last month
149
Safetensors
Model size
4.65B params
Tensor type
BF16
F32
U8
Inference Providers NEW
This model is not currently available via any of the supported third-party Inference Providers, and the model is not deployed on the HF Inference API.