---
model-index:
- name: Kulyk-EN-UK
  results:
  - task:
      type: text-generation
    dataset:
      type: facebook/flores
      name: FLORES
      split: devtest
    metrics:
    - type: bleu
      value: 27.24
      name: BLEU
library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- en
- uk
pipeline_tag: text-generation
tags:
- liquid
- lfm2
- edge
datasets:
- lang-uk/FiftyFiveShades
base_model:
- LiquidAI/LFM2-350M
---

A lightweight model for machine translation from English to Ukrainian, fine-tuned from the recently published LFM2 model. Use the [demo](https://huggingface.co/spaces/Yehor/en-uk-translator) to test it.

For the reverse direction, Ukrainian to English, see [kulyk-uk-en](https://huggingface.co/Yehor/kulyk-uk-en).
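
Below is a minimal inference sketch with `transformers`. The repo id `Yehor/kulyk-en-uk` and the chat-style prompt are assumptions (by analogy with kulyk-uk-en); check the tokenizer's chat template for the exact input format the model expects.

```python
# Minimal inference sketch. The repo id and chat-style prompt are assumptions;
# consult the tokenizer's chat template for the format the model actually expects.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Yehor/kulyk-en-uk"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card quotes ~1 GB of RAM in bf16
    device_map="auto",
)

# Sentence to translate from English to Ukrainian.
messages = [{"role": "user", "content": "A cat sat on the mat."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```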

**Facts**:

- Fine-tuned for 1.4 epochs on 40M samples, filtered by a quality metric from ~53.5M
- 354M parameters
- Requires 1 GB of RAM to run in bf16
- BLEU on FLORES-200: 27.24
- Tokens per second: 229.93 (bs=1), 1664.40 (bs=10), 8392.48 (bs=64); see the benchmark sketch after this list
- License: lfm1.0
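
The throughput numbers above depend on hardware and generation settings. A rough benchmark sketch for batched generation follows; the repo id, prompt, and generation length are assumptions, not the script used for the table.

```python
# Rough throughput sketch for batched generation (bs=10 here); the repo id,
# prompt, and generation length are assumptions, so absolute numbers will
# differ from the card's table.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Yehor/kulyk-en-uk"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")
if tokenizer.pad_token is None:  # fall back if no pad token is defined
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="cuda"
)

batch = ["A cat sat on the mat."] * 10  # bs=10
inputs = tokenizer(batch, return_tensors="pt", padding=True).to(model.device)

torch.cuda.synchronize()
start = time.perf_counter()
out = model.generate(**inputs, max_new_tokens=128)
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

# Upper-bound count: sequences that hit EOS early are padded to the max length.
generated = (out.shape[-1] - inputs["input_ids"].shape[-1]) * out.shape[0]
print(f"{generated / elapsed:.2f} tokens/s")
```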

**Info**:

- The model is named after [Sergiy Kulyk](https://en.wikipedia.org/wiki/Sergiy_Kulyk), who served as chargé d'affaires of Ukraine in the United States

**Training Info** (a configuration sketch follows this list):

- Learning rate: 3e-5
- Learning rate scheduler: cosine
- Warmup ratio: 0.05
- Max length: 2048 tokens
- Batch size: 10
- `packed=True`
- Sentences <= 1000 characters
- Gradient accumulation steps: 4
- Flash Attention 2
- Time per epoch: 32 hours
- 2x NVIDIA RTX 3090 Ti (24 GB)
- `accelerate` with DeepSpeed, offloading to CPU
- Memory usage: 22.212-22.458 GB
- torch 2.7.1
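
As a rough illustration, the hyperparameters above could map onto TRL's `SFTConfig` as sketched below. The actual training script, dataset preprocessing, and DeepSpeed/`accelerate` configuration are not shown in this card, so treat the mapping (including reading `packed=True` as TRL's `packing`) as an assumption.

```python
# Hedged sketch mapping the hyperparameters above onto TRL's SFTConfig;
# illustrative only, not the actual training script.
from trl import SFTConfig

config = SFTConfig(
    output_dir="kulyk-en-uk",        # hypothetical
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    max_length=2048,                 # named max_seq_length in older TRL releases
    per_device_train_batch_size=10,
    gradient_accumulation_steps=4,
    packing=True,                    # the card's `packed=True`
    bf16=True,
)
```

Flash Attention 2 would be enabled when loading the base model, e.g. `AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2-350M", attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16)`.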

**Acknowledgements**:

- [Serhiy Stetskovych](https://huggingface.co/patriotyk) for providing the compute to train this model
- [lang-uk](https://huggingface.co/lang-uk) members for their compilation of different MT datasets