---
license: apache-2.0
tags:
- unsloth
- LoRA
- Hinglish
- text-generation-inference
datasets:
- Subh775/formatted-hindi-hinglish-cot
base_model:
- unsloth/gemma-2-9b
pipeline_tag: text-generation
library_name: adapter-transformers
---

# About this model

This model is part of a series of large language models fine-tuned to give accurate, domain-specific responses for particular use cases. It is fine-tuned from the base model [unsloth/gemma-2-9b](https://huggingface.co/unsloth/gemma-2-9b) on the dataset [Subh775/formatted-hindi-hinglish-cot](https://huggingface.co/datasets/Subh775/formatted-hindi-hinglish-cot), using all of its training samples.

## Model Details

- **Base model**: unsloth/gemma-2-9b
- **Architecture**: Gemma 2
- **Parameters**: 9B
- **Fine-tuning method**: LoRA (r=16)

## Training details

The model was trained with `Unsloth` for up to 2x-faster fine-tuning and `LoRA` for parameter-efficient fine-tuning, for a single epoch of 60 optimization steps. The training loss over those steps is shown below:

![Training Loss](https://huggingface.co/QuantumInk/gemma2-9b-Hinglish-sft/resolve/main/train_loss.png)
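For context, the following is a minimal sketch of what the training setup may have looked like, following the standard Unsloth LoRA recipe. Only the base model, the dataset, the LoRA rank (r=16), and the 60 steps come from this card; `lora_alpha`, `target_modules`, batch size, learning rate, and the assumption that the dataset exposes a pre-formatted `text` column are illustrative guesses, not the author's confirmed configuration.

```python
import torch
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit, mirroring the inference settings used later in this card
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2-9b",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; r=16 is stated in the card, the rest are typical Unsloth defaults
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumption
    lora_alpha=16,       # assumption
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
)

# All training samples, as stated above
dataset = load_dataset("Subh775/formatted-hindi-hinglish-cot", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumes a pre-formatted "text" column
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,   # assumption
        gradient_accumulation_steps=4,   # assumption
        max_steps=60,                    # the 60 steps mentioned above
        learning_rate=2e-4,              # assumption
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()
```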
## ⚠️ Warning

This model was trained on a narrowly focused dataset and can only discuss topics that the dataset covers.

> The model answers on the basis of its training dataset; it should not be relied on for real-life decisions and is intended only for educational purposes and basic queries.

## Inference instructions

```python
!pip install unsloth
```

```python
from unsloth import FastLanguageModel
import torch

# Define the Alpaca prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="QuantumInk/gemma2-9b-Hinglish-sft",
    max_seq_length=2048,
    load_in_4bit=True
)

# Enable optimized inference mode for 2x faster inference
FastLanguageModel.for_inference(model)
```

```python
from transformers import TextStreamer

inputs = tokenizer(
    [
        alpaca_prompt.format(
            "negative log likelihood function kya kaam krta hai?",  # instruction
            "",  # input
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=128)
```

### Model's response

```text
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
negative log likelihood function kya kaam krta hai?

### Input:


### Response:
Negative log likelihood function ek tarah ka loss function hai jo probability distribution ke beech ka gap measure karta hai. Matlab, agar hum kisi model ke predictions ko observed data se compare karte hain, toh negative log likelihood function yeh batata hai ki model kitna accurate hai. Iska matlab hai ki agar negative log likelihood value kam ho, toh model data ko better predict karta hai. Iska use regression aur classification models mein hota hai, jahan par model ke predictions ko observed data ke saath compare karke uski performance ko measure kiya jata hai.
```

---

```python
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "किसी मशीन लर्निंग मॉडल को कैसे एवालुअते करें ?",  # instruction
            "",  # input
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=1280)
```

### Model's response

```text
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
किसी मशीन लर्निंग मॉडल को कैसे एवालुअते करें ?

### Input:


### Response:
मशीन लर्निंग मॉडल की एवालुएशन के लिए सबसे पहले आपको मॉडल के प्रदर्शन को मापने के लिए एक उपयुक्त मेट्रिक्स चुनना चाहिए। उदाहरण के लिए, अगर आप एक क्लासिफिकेशन प्रॉब्लम पर काम कर रहे हैं, तो आप accuracy, precision, recall, या F1 score जैसी मेट्रिक्स का इस्तेमाल कर सकते हैं। इसके बाद, आपको अपने मॉडल के प्रदर्शन को विभिन्न डेटासेट पर टेस्ट करना चाहिए, जैसे कि training set, validation set, और test set। प्रत्येक डेटासेट पर मॉडल के performance को अलग-अलग मेट्रिक्स के साथ मापा जाता है, ताकि आप देखें कि मॉडल कितना robust है और generalization power रखता है। अगर आप एक regression problem पर काम कर रहे हैं, तो RMSE (Root Mean Squared Error) या MAE (Mean Absolute Error) जैसी मेट्रिक्स का इस्तेमाल करके मॉडल के performance को measure किया जाता है। साथ ही, आप cross-validation techniques जैसे k-fold cross-validation का भी इस्तेमाल कर सकते हैं, ताकि आप मॉडल के performance को unbiased estimate प्राप्त कर सकें। इसके अलावा, मॉडल के performance को visually भी समझा जा सकता है। उदाहरण के लिए, confusion matrix, ROC curve, या precision-recall curve का उपयोग करके आप मॉडल के decision boundaries को देख सकते हैं और उसका performance को better understand कर सकते हैं। इस प्रकार, मशीन लर्निंग मॉडल की एवालुएशन एक multi-faceted process है जिसमें quantitative metrics, visual analysis, और cross-validation techniques शामिल होते हैं।
```

# Licensing

This model is released under the Apache License 2.0.

## Citation

```bibtex
@misc{QuantumInk2025gemma2hinglish,
  title        = {gemma2-9b-Hinglish-sft},
  author       = {QuantumInk},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/QuantumInk/gemma2-9b-Hinglish-sft}},
}
```