---
base_model: nur-dev/roberta-large-kazqad
library_name: peft
datasets:
- Kundyzka/informatics_kaz
language:
- kk
- en
pipeline_tag: question-answering
license: apache-2.0
metrics:
- accuracy
- f1
---

# Model Card for RoBERTa-large-KazQAD-Informatics-fp16-lora

RoBERTa-large-KazQAD-Informatics-fp16-lora is an optimized variant of the RoBERTa model, fine-tuned and adapted with LoRA for question-answering tasks in the Kazakh language using the KazQAD dataset.

## Model Details

### Model Description

The model is designed to perform efficiently on question-answering tasks in Kazakh, showing substantial metric improvements after fine-tuning and LoRA adaptation.

- **Developed by:** Tleubayeva Arailym, Saparbek Makhambet, Bassanova Nurgul, Shomanov Aday, [Sabitkhanov Askhat](https://huggingface.co/SayBitekhan)
- **Model type:** Transformer-based (RoBERTa)
- **Language(s) (NLP):** Kazakh (kk)
- **License:** apache-2.0
- **Finetuned from model:** nur-dev/roberta-large-kazqad

## Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

peft_model_id = "Arailym-tleubayeva/RoBERTa-large-KazQAD-Informatics-fp16-lora"

# Load the base model and tokenizer, then attach the LoRA adapter
base_model = AutoModelForQuestionAnswering.from_pretrained("nur-dev/roberta-large-kazqad").to(device)
tokenizer = AutoTokenizer.from_pretrained("nur-dev/roberta-large-kazqad")
model = PeftModel.from_pretrained(base_model, peft_model_id)
```
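Once the adapter is attached, the model can be used for extractive question answering. The snippet below is a minimal inference sketch: the Kazakh question and context strings are illustrative placeholders, and the greedy start/end span decoding is a generic extractive-QA pattern, not an official example from the authors.

```python
question = "Алгоритм дегеніміз не?"  # illustrative: "What is an algorithm?"
context = "Алгоритм – белгілі бір есепті шешуге арналған нақты әрекеттер тізбегі."  # illustrative context

inputs = tokenizer(question, context, return_tensors="pt", truncation=True).to(device)

model.eval()
with torch.no_grad():
    outputs = model(**inputs)

# Greedily pick the most likely start/end token positions and decode that span
start_idx = outputs.start_logits.argmax(dim=-1).item()
end_idx = outputs.end_logits.argmax(dim=-1).item()
answer = tokenizer.decode(
    inputs["input_ids"][0][start_idx : end_idx + 1],
    skip_special_tokens=True,
)
print(answer)
```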
### Direct Use

The model can directly answer questions posed in Kazakh and is suitable for deployment in NLP applications and platforms focused on Kazakh language understanding.

### Downstream Use

Well suited for integration into larger applications, chatbots, and information-retrieval systems that need improved user interaction in Kazakh.

### Out-of-Scope Use

Not recommended for:

- Tasks involving languages other than Kazakh without further adaptation.
- Critical decision-making systems without additional verification processes.

## Bias, Risks, and Limitations

- Potential biases may arise from the underlying training data sources.
- Model accuracy may degrade on ambiguous or complex queries outside the training domain.

### Recommendations

Users should consider additional fine-tuning or bias-mitigation strategies when deploying the model in sensitive contexts.

## Evaluation Results

Evaluation showed significant improvements after fine-tuning and applying LoRA. The base model, before any modification, reached an Exact Match (EM) score of 17.92% and an F1-score of 31.57%, indicating that it struggled to identify precise answers in its initial state.

After fine-tuning on the KazQAD dataset, performance improved dramatically: the EM score rose to 56.69% and the F1-score to 69.70%, corresponding to 316.2% and 220.8% of the baseline EM and F1 values, respectively. This confirms that fine-tuning substantially enhances the model's ability to process and understand Kazakh-language questions accurately.

With the LoRA adapter applied in a mixed-precision (FP16) setup, the model retained a strong improvement over the base version while being computationally more efficient: it achieved an EM score of 37.79% and an F1-score of 56.07%, corresponding to 210.9% and 177.6% of the baseline EM and F1 values, respectively. This adaptation balances performance against resource efficiency, making it a viable option when computational constraints are a concern.

## Technical Specifications

### Model Architecture and Objective

RoBERTa architecture optimized via fine-tuning and LoRA.

### Compute Infrastructure

#### Hardware

GPU-based training infrastructure.

#### Software

PEFT 0.14.0

## Citation

Detailed citation information will be added later.

## Model Card Authors

- Tleubayeva Arailym
- Saparbek Makhambet
- Bassanova Nurgul
- [Sabitkhanov Askhat](https://huggingface.co/SayBitekhan)
- Shomanov Aday