Llama-3.1-8B Fine-tuned for Russian-Kazakh Translation

This model is a fine-tuned version of Meta's Llama-3.1-8B, optimized for bidirectional translation between Russian and Kazakh. It performs well in both directions and is particularly strong in Russian-to-Kazakh translation, where it outperforms the open-source baselines evaluated below.

Model Details

  • Base Model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
  • Training Duration: 166 hours
  • Hardware: 1x NVIDIA A100 SXM GPU
  • Training Framework: Unsloth library for efficient fine-tuning (see the loading sketch after this list)
  • Model Size: 8B parameters
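
The training script itself is not published. Purely as an illustration, the sketch below shows how a 4-bit base checkpoint of this kind is typically loaded and wrapped with LoRA adapters in Unsloth; every hyperparameter here (sequence length, r, alpha, target modules) is a placeholder, not the setting used for this model.

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized base checkpoint listed above.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    max_seq_length=2048,  # placeholder; the actual training length is not published
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient fine-tuning.
# r / lora_alpha / target_modules are common defaults, not the authors' values.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    use_gradient_checkpointing=True,
)
```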

Training Data

Data Sources and Distribution

| Source | Samples | Percentage |
|---|---:|---:|
| nu | 848,509 | 55.38% |
| kazparc (kk-ru) | 290,785 | 18.98% |
| kazparc (kk-en) | 290,875 | 18.99% |
| kaznu | 78,920 | 5.15% |
| News-Commentary | 9,075 | 0.59% |
| TED2020 | 6,887 | 0.45% |
| QED | 4,664 | 0.30% |
| Tatoeba | 2,301 | 0.15% |
| Total | 1,532,016 | 100% |
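
The exact instruction format used to serialize these parallel pairs for training is not documented. Purely as a hypothetical illustration, a mixture like the one above is usually rendered into single training strings along these lines (the PROMPT template and to_example helper are invented for this sketch):

```python
from datasets import Dataset

# Hypothetical prompt template -- the card does not publish the actual
# instruction format used during fine-tuning.
PROMPT = (
    "Translate the following text from {src_lang} to {tgt_lang}.\n\n"
    "{src_lang}: {src}\n{tgt_lang}: {tgt}"
)

def to_example(src, tgt, src_lang, tgt_lang):
    # Render one parallel pair as a single training string.
    return {"text": PROMPT.format(src_lang=src_lang, tgt_lang=tgt_lang,
                                  src=src, tgt=tgt)}

# Two toy pairs, one per translation direction.
pairs = [
    ("Привет, как дела?", "Сәлем, қалайсың?", "Russian", "Kazakh"),
    ("Сәлем, қалайсың?", "Привет, как дела?", "Kazakh", "Russian"),
]
train = Dataset.from_list([to_example(*p) for p in pairs])
print(train[0]["text"])
```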

Evaluation Results – Russian to Kazakh (ru-kk)

MT Metrics (100 samples)

| Model | Type | BLEU | COMET |
|---|---|---:|---:|
| PolynomeAI | Open-source | 13.31 | 0.85 |
| issai/LLama-3.1-KazLLM-1.0-8B | Open-source | 4.51 | 0.75 |
| meta/LLama-3.1-1.0-8B | Open-source | 4.66 | 0.65 |
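
The card does not state which BLEU implementation or COMET checkpoint produced these numbers. A minimal sketch using sacrebleu and Unbabel's wmt22-comet-da checkpoint (common defaults, assumed here) shows how comparable scores can be computed; the same procedure applies to the kk-ru results below.

```python
import sacrebleu
from comet import download_model, load_from_checkpoint

srcs = ["Я люблю читать книги."]              # Russian source
hyps = ["Мен кітап оқығанды жақсы көремін."]  # model output (Kazakh)
refs = ["Мен кітап оқуды жақсы көремін."]     # human reference

# Corpus-level BLEU via sacrebleu.
bleu = sacrebleu.corpus_bleu(hyps, [refs])
print(f"BLEU: {bleu.score:.2f}")

# COMET; checkpoint choice is an assumption, not stated in the card.
ckpt = download_model("Unbabel/wmt22-comet-da")
comet = load_from_checkpoint(ckpt)
out = comet.predict(
    [{"src": s, "mt": h, "ref": r} for s, h, r in zip(srcs, hyps, refs)],
    batch_size=8,  # pass gpus=0 to run on CPU
)
print(f"COMET: {out.system_score:.2f}")
```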

LLM Judge Evaluation (GPT-4o mini)

Comparison with Yandex:

  • Yandex Better: 78.0%
  • PolynomeAI Better: 16.5%
  • Both Good: 2.5%
  • Both Bad: 3.0%
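
The judging prompt and protocol are not published. Below is a minimal sketch of a pairwise GPT-4o mini judge; the prompt wording and the judge() helper are placeholders invented for this sketch, not the setup actually used.

```python
from openai import OpenAI

client = OpenAI()

# Placeholder judging prompt -- the actual prompt is not published.
JUDGE_PROMPT = """You are a translation quality judge.
Source (Russian): {src}
Translation A (Yandex): {a}
Translation B (PolynomeAI): {b}
Answer with exactly one of: "A better", "B better", "both good", "both bad"."""

def judge(src: str, a: str, b: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(src=src, a=a, b=b)}],
        temperature=0,  # deterministic verdicts
    )
    return resp.choices[0].message.content.strip()

print(judge("Я люблю читать книги.",
            "Мен кітап оқуды жақсы көремін.",
            "Мен кітап оқығанды жақсы көремін."))
```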

Evaluation Results – Kazakh to Russian (kk-ru)

MT Metrics (100 samples)

| Model | Type | BLEU | COMET |
|---|---|---:|---:|
| PolynomeAI | Open-source | 28.72 | 0.91 |
| issai/LLama-3.1-KazLLM-1.0-8B | Open-source | 28.06 | 0.91 |
| meta/LLama-3.1-1.0-8B | Open-source | 16.64 | 0.87 |

We do not include deepvk/kazRush-ru-kk in this evaluation because it was trained specifically for the ru-kk direction.

LLM Judge Evaluation (GPT-4o mini)

Comparison with Yandex:

  • Yandex Better: 52.0%
  • PolynomeAI Better: 41.0%
  • Both Good: 4.5%
  • Both Bad: 2.5%

Usage

To Do
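
Official usage instructions are still to do. Until they are added, here is a minimal inference sketch, assuming the checkpoint loads the same way as its Unsloth 4-bit base and using a hypothetical prompt template (the actual template is not documented):

```python
from unsloth import FastLanguageModel

# Load the fine-tuned checkpoint; 4-bit loading mirrors the base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="PolynomeAI/Llama-3.1-8B-kkru",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast generation path

# Hypothetical prompt format -- adjust once the official template is published.
prompt = ("Translate the following text from Russian to Kazakh.\n\n"
          "Russian: Я люблю читать книги.\nKazakh:")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```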

Limitations and Bias

  • The model's performance has been primarily evaluated on a test set of 100 samples
  • Performance may vary depending on domain and complexity of the input text
  • The model inherits any biases present in the Llama-3.1-8B base model and training data
  • The training data is heavily skewed towards the 'nu' source (55.4%, per the distribution table above)
  • Most of the training data (97.1%) falls in the moderate quality range based on COMET scores (0.4-0.6); see the scoring sketch below
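
The COMET-based quality figure above suggests the training pairs were scored without references. A minimal sketch of such reference-free scoring, assuming a quality-estimation checkpoint like Unbabel/wmt22-cometkiwi-da (an assumption, not confirmed by the card; this checkpoint is gated on the Hub):

```python
from comet import download_model, load_from_checkpoint

# Reference-free quality estimation; the checkpoint choice is an assumption,
# the card does not say how the training pairs were actually scored.
ckpt = download_model("Unbabel/wmt22-cometkiwi-da")
qe = load_from_checkpoint(ckpt)

pairs = [{"src": "Я люблю читать книги.", "mt": "Мен кітап оқуды жақсы көремін."}]
scores = qe.predict(pairs, batch_size=8).scores  # pass gpus=0 to run on CPU

# Keep only pairs at or above the moderate-quality band mentioned above.
kept = [p for p, s in zip(pairs, scores) if s >= 0.4]
print(len(kept), "of", len(pairs), "pairs kept")
```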