This model is fine-tuned from the princeton-nlp/Llama-3-Base-8B-SFT model using the SelectiveDPO on the Ultrafeedback_binarized dataset.

For the recipe to reproduce this model, please visit our GitHub page.

Safetensors

Model size

7.5B params

Tensor type

F32

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for glorgao/SelectiveDPO-Llama3-8B-SFT-UFBinarized

Base model

Finetuned

(36)

this model

Dataset used to train glorgao/SelectiveDPO-Llama3-8B-SFT-UFBinarized