Model Overview
RefinedNeuro/RN_TR_R2 is a Turkish-language reasoning model fine-tuned from ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 using the GRPO method on Q&A data distilled from the Qwen3 model. It excels at open-ended reasoning tasks in Turkish, particularly STEM, history, and cultural questions.
- 🌐 Available on: https://ollama.com/RefinedNeuro/RN_TR_R2
Evaluation
Benchmark: RN_TR_R2_Benchmark_Results (Open-ended Turkish culture & reasoning questions)
| Model Name | Score (%) | Description |
|---|---|---|
| RefinedNeuro/RN_TR_R2 | 82.4 | Upgraded from RN_TR_R1 via GRPO on distilled Qwen3 data |
| Qwen3-8B | 76.8 | Off-the-shelf Qwen3-8B |
| RefinedNeuro/RN_TR_R1 | 64.8 | Baseline Turkish fine-tuned model |
| ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 | 63.2 | Turkish-adapted Llama-8B via direct preference optimization (DPO) |
| Meta-Llama-3.1-8B-Instruct | 63.2 | Meta's Llama-3.1-8B with instruction tuning |
RN_TR_R2 achieves a 17.6-percentage-point improvement over the RN_TR_R1 baseline.
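The score gaps above can be recomputed directly from the table (model names and scores copied from it):

```python
# Benchmark scores (%) copied from the evaluation table above
scores = {
    "RN_TR_R2": 82.4,
    "Qwen3-8B": 76.8,
    "RN_TR_R1": 64.8,
    "Turkish-Llama-8b-DPO-v0.1": 63.2,
    "Meta-Llama-3.1-8B-Instruct": 63.2,
}

# Improvement of RN_TR_R2 over each other model, in percentage points
gains = {name: round(scores["RN_TR_R2"] - s, 1)
         for name, s in scores.items() if name != "RN_TR_R2"}
print(gains["RN_TR_R1"])  # 17.6
```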
Intended Use
- Primary use cases: Question answering, step-by-step reasoning, educational tools in Turkish (math, physics, chemistry, biology, geometry, trigonometry, statistics, history, culture).
- Not recommended: Generating creative fiction or tasks requiring memorized facts outside its training scope.
- Known limitations: No systematic failure modes have been identified, but like all LLMs, RN_TR_R2 may hallucinate.
Training Data
Base model: ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1
Fine-tuning dataset: RefinedNeuro/Qwen3-Reasoning-Distill-Q-A-Dataset
- Domains: Mathematics, Physics, Chemistry, Biology, Geometry, Trigonometry, Statistics
- Grade levels: 6th–12th grade
- Prompt variations (e.g., long/detailed vs. short/easy; answer must be a single number)
- Size: 13,000 examples (~9 million tokens)
- License: Creative Commons
Training Configuration
- Method: GRPO (Group Relative Policy Optimization) via unsloth
- Hardware: 2× RTX 3090 GPUs for 1 week (~336 GPU-hours)
- Batch size: 1
- Context length: 4,096 tokens (max_seq_length=8192 at inference)
- Optimizer & scheduler: Cosine LR schedule, peak LR=5×10⁻⁶
- Library: unsloth==2025.3.18
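GRPO scores each sampled completion relative to the other completions drawn for the same prompt, rather than against a learned value function. A minimal sketch of the group-relative advantage computation (illustrative only; unsloth handles this internally during training):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: standardize each completion's reward
    against the mean and std of its own sampling group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All completions scored the same: no learning signal
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Example: 4 completions for one prompt, rewarded 1.0 when the
# \boxed{} answer is correct and 0.0 otherwise
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```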
Usage Example
```shell
pip install unsloth==2025.3.18
```

```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="RefinedNeuro/RN_TR_R2",
    max_seq_length=8192,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

# System prompt (in Turkish): "When answering questions, think step by step
# and justify your reasoning. First think step by step between <think></think>
# tags, then summarize your reasoning steps, and finally give the answer
# inside \boxed{}."
SYSTEM_PROMPT = """
Sana sorulan sorulara cevap verirken, adım adım düşün ve gerekçelendir.
Önce <think></think> tagları arasında adım adım düşün.
Sonra düşünme adımlarını özetle.
En son \\boxed{} içerisinde sorunun cevabını ver.
"""

def answer_question(question: str):
    messages = [
        {'role': 'system', 'content': SYSTEM_PROMPT},
        {'role': 'user', 'content': question},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to("cuda")
    streamer = TextStreamer(tokenizer)
    print("-" * 60)
    _ = model.generate(
        input_ids=inputs,
        streamer=streamer,
        max_new_tokens=2048,
        temperature=0.6,
        top_p=0.1,
    )

# Examples
answer_question("Türkiye'nin başkenti neresidir?")  # "What is the capital of Türkiye?"
answer_question("6 ve 2'nin çarpımı nedir?")  # "What is the product of 6 and 2?"
```
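Because the system prompt instructs the model to place its final answer inside \boxed{}, the answer can be pulled out of the generated text with a small regex helper (a hypothetical utility, not part of the model's API):

```python
import re

def extract_boxed_answer(text: str):
    """Return the contents of the last \\boxed{...} in the model output,
    or None if no boxed answer was produced."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", text)
    return matches[-1] if matches else None

sample_output = r"<think>12 - 6 + 5 = 11</think> \boxed{11}"
print(extract_boxed_answer(sample_output))  # 11
```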
Sample Q&A
> **Q:** f(x) = 3x⁴ - 2x³ + 5x - 7 fonksiyonunun x = 1 noktasındaki türevini bulunuz. *(Find the derivative of f(x) = 3x⁴ - 2x³ + 5x - 7 at x = 1.)*
>
> **A:** <think>
> 1. Adım: f(x) = 3x⁴ - 2x³ + 5x - 7 fonksiyonunun türevini bulmak için, önce f(x)'nin türevini bulmamız gerekmektedir.
> 2. Adım: Türevi bulmak için fonksiyonu parçalara ayırarak analiz edeceğiz.
> f(x) = 3x⁴ - 2x³ + 5x - 7
> 3. Adım: Türevi bulmak için her terimin ayrı ayrı türevini buluyoruz.
> - 3x⁴: Türevi 12x³'tür.
> - -2x³: Türevi -6x²'dir.
> - 5x: Türevi 5'tir.
> - -7: Türevi 0'dır (çünkü sabit bir sayıdır).
> 4. Adım: Elde edilen türevleri birleştiriyoruz.
> f'(x) = 12x³ - 6x² + 5
> 5. Adım: Şimdi x = 1 yerine koyarak f'(1)'i hesaplayacağız.
> f'(1) = 12(1)³ - 6(1)² + 5
> f'(1) = 12 - 6 + 5
> f'(1) = 11
> 6. Adım: x = 1 noktasındaki türevi 11 olarak bulduk.
> \boxed{11}
> </think>
> \boxed{11}
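The boxed result can be sanity-checked numerically: a central finite difference around x = 1 approximates the derivative without any symbolic math.

```python
def f(x):
    return 3 * x**4 - 2 * x**3 + 5 * x - 7

# Central difference: f'(x) ≈ (f(x + h) - f(x - h)) / (2h)
h = 1e-6
approx = (f(1 + h) - f(1 - h)) / (2 * h)
print(round(approx, 3))  # 11.0, matching the model's \boxed{11}
```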
Citation
If you use this model, please cite:
```bibtex
@misc{farsak2025rn_tr_r2,
  title        = {RefinedNeuro/RN\_TR\_R2: A Turkish Reasoning Model},
  author       = {Mehmet Can Farsak and Serhat Atayeter and RefinedNeuro},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/RefinedNeuro/RN_TR_R2}},
  license      = {Apache-2.0}
}
```
License
Apache 2.0, a permissive open-source license.
Authors
- Authors: Mehmet Can Farsak, Serhat Atayeter, RefinedNeuro
Model tree for RefinedNeuro/RN_TR_R2
- Base model: meta-llama/Meta-Llama-3-8B
- Fine-tuned from: ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1