Based on Meta-Llama-3-8B-Instruct and governed by the Meta Llama 3 License agreement: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
We realized there was a tokenization mistake with the previous DPO model, so this is a new version testing out DPO training on the following dataset:
The Open LLM Leaderboard results are really BAD lol. Something in this dataset seems to disagree with Llama 3?
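For context, a minimal sketch of what a DPO run like this might look like with Hugging Face TRL; the dataset name, hyperparameters, and output path below are hypothetical placeholders, not the actual training configuration used for this model:

```python
# Minimal DPO training sketch using Hugging Face TRL.
# NOTE: dataset name, hyperparameters, and paths are hypothetical placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# A preference dataset with "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("some-org/some-dpo-dataset", split="train")  # placeholder

config = DPOConfig(
    output_dir="llama3-8b-dpo",  # placeholder
    beta=0.1,                    # DPO regularization strength
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # older TRL versions use tokenizer= instead
)
trainer.train()
```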
Instruct format:
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
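Rather than assembling this string by hand, you can let the tokenizer apply the chat template for you. A minimal sketch (the repo id matches the FP16 link below; the messages are just examples):

```python
# Build a Llama 3 instruct prompt via the tokenizer's chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("OwenArli/ArliAI-Llama-3-8B-Instruct-DPO-v0.2")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},  # example system prompt
    {"role": "user", "content": "Hello!"},                          # example user message
]

# add_generation_prompt=True appends the trailing assistant header,
# matching the format shown above.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```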
Quants:
FP16: https://huggingface.co/OwenArli/ArliAI-Llama-3-8B-Instruct-DPO-v0.2
GGUF: https://huggingface.co/OwenArli/ArliAI-Llama-3-8B-Instruct-DPO-v0.2-GGUF
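If you go with the GGUF repo, something like this should work with llama-cpp-python; the quant filename pattern here is an assumption, so check the repo for the exact file names:

```python
# Load a GGUF quant directly from the Hub with llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="OwenArli/ArliAI-Llama-3-8B-Instruct-DPO-v0.2-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quant name; pick an actual file from the repo
    n_ctx=8192,               # Llama 3 context length
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}]
)
print(out["choices"][0]["message"]["content"])
```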