---
base_model: meta-llama/Meta-Llama-3-8B-Instruct
library_name: peft
datasets:
- sinap/FarsiTinyStories
---
# MISHANM/Farsi_eng_text_generation_Llama3_8B_instruct
This model is fine-tuned for the Farsi language. It can answer questions in Farsi and translate text between Farsi and English, producing accurate, context-aware responses that reflect the details and subtleties of the language.
## Model Details
1. Language: Farsi
2. Tasks: Question Answering (Farsi to Farsi), Translation (Farsi to English)
3. Base Model: meta-llama/Meta-Llama-3-8B-Instruct
## Training Details
The model was fine-tuned on approximately 110,000 instruction samples.
1. GPUs: 4× AMD Radeon™ PRO V620
2. Training Time: 100:52:29 (hh:mm:ss)
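
The card does not publish the adapter configuration, so the following is a minimal, purely illustrative sketch of how a LoRA instruction fine-tune of this base model might be set up with `peft`. All hyperparameters below (rank, alpha, dropout, target modules) are assumptions, not the values used for this model:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Illustrative only: the actual adapter settings for this model are not published.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", device_map="auto"
)
lora_config = LoraConfig(
    r=16,           # assumed adapter rank
    lora_alpha=32,  # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```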
## Inference with HuggingFace
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer
model_path = "MISHANM/Farsi_eng_text_generation_Llama3_8B_instruct"
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Function to generate text
def generate_text(prompt, max_new_tokens=1000, temperature=0.9):
    # Build the conversation for the model
    messages = [
        {
            "role": "system",
            "content": "You are a Farsi language expert and linguist, with same knowledge give response in Farsi language.",
        },
        {"role": "user", "content": prompt},
    ]
    # Format the conversation with the prompt template used by this model
    formatted_prompt = (
        f"<|system|>{messages[0]['content']}"
        f"<|user|>{messages[1]['content']}<|assistant|>"
    )
    # Tokenize, move the inputs to the model's device, and sample a completion
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=max_new_tokens, temperature=temperature, do_sample=True
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage: a short Farsi story ("Once upon a time, in a big lake, there
# was a brown kayak. The brown kayak loved to roll in the water all day. It was
# very happy when it could roll and splash in the lake.")
prompt = """روزی روزگاری در یک دریاچه بزرگ، یک کایاک قهوه ای رنگ بود. کایاک قهوه ای دوست داشت تمام روز در آب غلت بزند. وقتی می توانست در دریاچه غلت بزند و پاشیده شود بسیار خوشحال شد."""
generated_text = generate_text(prompt)
print(generated_text)
```
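
Since this repository hosts a PEFT adapter (`library_name: peft`), it can alternatively be loaded with `peft`'s `AutoPeftModelForCausalLM`, which fetches the base model and applies the adapter in one step. A minimal sketch of a Farsi-to-English translation call, assuming the adapter config points at the base model above; the system prompt here is only an illustration, not the one used at fine-tuning time:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model_path = "MISHANM/Farsi_eng_text_generation_Llama3_8B_instruct"
# Loads the base model and applies the adapter weights in one call
model = AutoPeftModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Farsi -> English translation, using the same prompt template as above
prompt = (
    "<|system|>You are a Farsi language expert and linguist. Translate the user's text to English."
    "<|user|>روزی روزگاری در یک دریاچه بزرگ، یک کایاک قهوه ای رنگ بود.<|assistant|>"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```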
## Citation Information
```bibtex
@misc{MISHANM/Farsi_eng_text_generation_Llama3_8B_instruct,
author = {Mishan Maurya},
title = {Introducing Fine Tuned LLM for Farsi Language},
year = {2025},
publisher = {Hugging Face},
journal = {Hugging Face repository},
}
```
### Framework versions

- PEFT 0.12.0