# Llama-3.1-8B Fine-tuned for Russian-Kazakh Translation
This model is a fine-tuned version of Meta's Llama-3.1-8B, optimized for bidirectional translation between Russian and Kazakh. It performs well in both directions and is strongest on Russian-to-Kazakh translation, where it outperforms the open-source baselines evaluated below.
## Model Details

- Base Model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
- Training Duration: 166 hours
- Hardware: 1x NVIDIA A100 SXM GPU
- Training Framework: Unsloth library for efficient fine-tuning (see the sketch after this list)
- Model Size: 8B parameters
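The exact training configuration is not published. For orientation, a minimal Unsloth QLoRA setup on this base model typically looks like the sketch below; every hyperparameter shown (LoRA rank, batch size, learning rate, epochs) is an illustrative assumption, not the configuration actually used for this model.

```python
# Illustrative Unsloth QLoRA setup -- all hyperparameters are assumptions,
# not the configuration actually used to train this model.
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",  # 4-bit base from this card
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank and alpha are placeholder values.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,  # ~1.5M formatted translation pairs (see below)
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        bf16=True,
        logging_steps=50,
        output_dir="outputs",
    ),
)
trainer.train()
```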
## Training Data

### Data Sources and Distribution

| Source | Samples | Percentage |
|---|---|---|
| nu | 848,509 | 55.38% |
| kazparc (kk-ru) | 290,785 | 18.98% |
| kazparc (kk-en) | 290,875 | 18.99% |
| kaznu | 78,920 | 5.15% |
| News-Commentary | 9,075 | 0.59% |
| TED2020 | 6,887 | 0.45% |
| QED | 4,664 | 0.30% |
| Tatoeba | 2,301 | 0.15% |
| Total | 1,532,016 | 100% |
## Evaluation Results – Russian to Kazakh (ru-kk)

### MT Metrics (100 samples)

| Model | Type | BLEU | COMET |
|---|---|---|---|
| PolynomeAI | Open-source | 13.31 | 0.85 |
| issai/LLama-3.1-KazLLM-1.0-8B | Open-source | 4.51 | 0.75 |
| meta-llama/Llama-3.1-8B | Open-source | 4.66 | 0.65 |
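The evaluation pipeline itself is not documented in this card. A plausible reconstruction with the sacrebleu and unbabel-comet packages, assuming the Unbabel/wmt22-comet-da checkpoint (the checkpoint actually used is not stated), looks like this:

```python
# Sketch of BLEU/COMET scoring over a held-out test set.
# The COMET checkpoint below is an assumption; the card does not name one.
from sacrebleu.metrics import BLEU
from comet import download_model, load_from_checkpoint

sources = ["Привет, как дела?"]       # Russian source sentences
hypotheses = ["Сәлем, қалың қалай?"]  # model outputs in Kazakh
references = ["Сәлем, қалайсың?"]     # gold Kazakh references

# Corpus-level BLEU via sacrebleu.
bleu = BLEU().corpus_score(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}")

# System-level COMET score.
comet = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
batch = [{"src": s, "mt": h, "ref": r}
         for s, h, r in zip(sources, hypotheses, references)]
print(f"COMET: {comet.predict(batch, batch_size=8).system_score:.2f}")
```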
### LLM Judge Evaluation (GPT-4o mini)

Comparison with Yandex:

- Yandex Better: 78.0%
- PolynomeAI Better: 16.5%
- Both Good: 2.5%
- Both Bad: 3.0%
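The judging prompt and protocol are not published either; the sketch below shows one hypothetical way to run a pairwise GPT-4o mini judge that yields the four buckets above. The function name, prompt wording, and labels are all illustrative.

```python
# Hypothetical pairwise judge; prompt wording and labels are illustrative,
# not the protocol actually used for the numbers above.
from openai import OpenAI

client = OpenAI()

def judge(source: str, candidate_a: str, candidate_b: str) -> str:
    """Ask GPT-4o mini to compare two anonymized ru->kk translations."""
    prompt = (
        "You are evaluating two Russian-to-Kazakh translations.\n"
        f"Source (Russian): {source}\n"
        f"Translation A: {candidate_a}\n"
        f"Translation B: {candidate_b}\n"
        "Reply with exactly one label: A_BETTER, B_BETTER, BOTH_GOOD, or BOTH_BAD."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic judging
    )
    return response.choices[0].message.content.strip()
```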
## Evaluation Results – Kazakh to Russian (kk-ru)

### MT Metrics (100 samples)

| Model | Type | BLEU | COMET |
|---|---|---|---|
| PolynomeAI | Open-source | 28.72 | 0.91 |
| issai/LLama-3.1-KazLLM-1.0-8B | Open-source | 28.06 | 0.91 |
| meta-llama/Llama-3.1-8B | Open-source | 16.64 | 0.87 |

deepvk/kazRush-ru-kk is excluded from this evaluation because it was trained specifically for the ru-kk direction.
### LLM Judge Evaluation (GPT-4o mini)

Comparison with Yandex:

- Yandex Better: 52.0%
- PolynomeAI Better: 41.0%
- Both Good: 4.5%
- Both Bad: 2.5%
## Usage

To do: official usage examples are still pending.
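Until then, here is a minimal inference sketch with the Transformers library. The prompt template is an assumption: the card does not document the format used during fine-tuning, so a plain instruction-style prompt is shown for illustration.

```python
# Minimal inference sketch; the prompt format below is an assumption,
# since the fine-tuning template is not documented in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PolynomeAI/Llama-3.1-8B-kkru"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Translate from Russian to Kazakh:\nПривет, как дела?\nTranslation:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```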
## Limitations and Bias

- Performance has so far been evaluated only on a test set of 100 samples per direction, so the metrics above should be treated as preliminary
- Performance may vary with the domain and complexity of the input text
- The model inherits any biases present in the Llama-3.1-8B base model and in the training data
- The training data is heavily skewed towards the 'nu' source (55.38% of samples)
- Most of the training data (97.1%) falls in the moderate quality range based on COMET scores (0.4-0.6); see the sketch after this list
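The card does not state which COMET checkpoint produced those training-data scores. For illustration, reference-free bucketing of this kind can be done with a quality-estimation COMET model such as Unbabel/wmt22-cometkiwi-da (an assumption; note this checkpoint is gated on the Hugging Face Hub):

```python
# Hypothetical quality bucketing of training pairs with reference-free COMET.
# The checkpoint choice is an assumption; the card does not name one.
from comet import download_model, load_from_checkpoint

qe = load_from_checkpoint(download_model("Unbabel/wmt22-cometkiwi-da"))

pairs = [{"src": "Привет, как дела?", "mt": "Сәлем, қалың қалай?"}]
scores = qe.predict(pairs, batch_size=8).scores

# Count pairs that land in the moderate-quality band reported above.
moderate = sum(0.4 <= s <= 0.6 for s in scores)
print(f"{moderate / len(scores):.1%} of pairs in the 0.4-0.6 band")
```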