Qwen 2.5-0.5B-Instruct – French DPO

A lightweight (≈494M parameters) Qwen2.5 model fine-tuned with Direct Preference Optimization (DPO) on the AIffl/french_orca_dpo_pairs dataset. The goal is to provide a French-aligned assistant while preserving the multilingual strengths, coding ability and long-context support already present in the base Qwen2.5-0.5B-Instruct model.
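For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines of plain Python. This is only an illustration of the standard DPO loss, not the training code used for this model: the function name `dpo_loss`, the value `beta=0.1` and the log-probabilities in the usage lines are all assumptions chosen for the example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss from summed sequence log-probabilities.

    beta=0.1 is an illustrative value, not one taken from this model card.
    """
    # How much more (or less) likely each answer became relative to the
    # frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

# Hypothetical log-probabilities for one (chosen, rejected) French answer pair.
untrained = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy equals reference
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)    # policy favours the chosen answer
print(untrained, improved)
```

When the policy matches the reference the loss equals log 2, and it falls as the policy raises the chosen answer's likelihood relative to the rejected one.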

Try it

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "BounharAbdelaziz/Qwen2.5-0.5B-DPO-French-Orca"
tok = AutoTokenizer.from_pretrained(model_id, use_fast=True)
# device_map="auto" relies on the accelerate package being installed.
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             torch_dtype="auto",
                                             device_map="auto")

messages = [
    {"role": "system", "content": "Vous êtes un assistant francophone serviable."},
    {"role": "user", "content": "Explique la différence entre fusion et fission nucléaires en 3 phrases."}
]
# Render the conversation with the model's chat template and open the assistant turn.
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tok.decode(new_tokens, skip_special_tokens=True))
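To make the `apply_chat_template` step less opaque, here is a minimal sketch of the ChatML-style layout that Qwen2.5 chat models use. The helper `render_chatml` is hypothetical and only approximates the result; the template shipped with the tokenizer is authoritative and may add more (for example a default system message or tool sections).

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate a ChatML-style prompt (hypothetical helper, not the tokenizer's template)."""
    # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "Vous êtes un assistant francophone serviable."},
    {"role": "user", "content": "Bonjour !"},
])
print(prompt)
```

In practice, prefer `tok.apply_chat_template`, which always matches the format the model was trained on.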

Intended use & limitations

•	Intended: French conversational agent, tutoring, summarisation, coding help in constrained contexts.
•	Not intended: Unfiltered medical, legal or financial advice; high-stakes decision making.

Although DPO reduces harmful completions, the model can still produce errors, hallucinations or biased outputs inherited from the base model and data. Always verify critical facts.

Model size: 494M params (Safetensors, tensor type F32)
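As a rough guide to what the parameter count and tensor type imply for memory, the weight footprint can be estimated as parameters times bytes per parameter. The sketch below uses the rounded 494M figure and a hypothetical helper `weight_gib`; it covers weights only and ignores activations, the KV cache and framework overhead.

```python
def weight_gib(num_params, bytes_per_param):
    """Approximate size of the model weights in GiB (weights only)."""
    return num_params * bytes_per_param / 2**30

NUM_PARAMS = 494_000_000  # rounded figure from the model card

# F32 stores 4 bytes per parameter; a 16-bit type would use 2.
print(f"F32:    {weight_gib(NUM_PARAMS, 4):.2f} GiB")
print(f"16-bit: {weight_gib(NUM_PARAMS, 2):.2f} GiB")
```

Loading the checkpoint in a 16-bit dtype would roughly halve the weight memory, at the cost of some numerical precision.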