Update README.md
README.md
CHANGED

Removed: the previous version of README.md, only partially visible in this diff. Details recoverable from the old card:

- Previous title (from its citation entry): Bleta-Meditor 27B GRPO Albanian Reasoning Model
- **Framework:** Hugging Face's TRL library
- **Optimization:** LoRA fine-tuning (r=8, alpha=8)
- **Reward Functions:** Format adherence, answer accuracy, and reasoning quality
- **Language Focus:** Optimized for Albanian
- Architecture: 27B parameters, 128K context window, QK normalization, 5 sliding + 1 global attention pattern, 1024 sliding window attention
- Limitations: like all language models, it may occasionally hallucinate or provide incorrect information outside its training domain
- Citation:

```
@misc{klei_aliaj_bleta_meditor,
  author = {Klei Aliaj},
  title = {Bleta-Meditor 27B GRPO Albanian Reasoning Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/klei1/bleta-meditor-27b-finetune}}
}
```

The updated README.md:
---
base_model: bleta-logjike-27b
tags:
- text-generation-inference
- transformers
...
- reasoning
- mathematics
- grpo
- gsm8k
- conversational
license: apache-2.0
language:
- sq
...
inference:
  max_new_tokens: 512
---

# Bleta-Logjike 27B Albanian Logical Reasoning Model

## Model Description
- **Developed by:** Klei Aliaj
- **Model type:** Bleta-Logjike 27B, optimized for Albanian logical reasoning
- **License:** apache-2.0
- **Format:** Full-precision model (Hugging Face Transformers format)
- **Language:** Albanian
- **Base architecture:** Gemma 3 27B

This model is the full-precision version of Bleta-Logjike 27B, optimized for logical reasoning tasks in Albanian. Bleta is an Albanian adaptation of Google's Gemma 3 architecture, and this release focuses on logical reasoning and problem solving for Albanian speakers.

## Capabilities & Features

### Logical Reasoning Focus
This Albanian language model excels at:

1. Logical analysis and deduction in Albanian
2. Step-by-step problem solving
3. Structured reasoning for complex problems
4. Understanding logical relationships and dependencies
5. Mathematical reasoning for grade-school level problems
6. Conversational reasoning and explanations

### Albanian Language Optimization
- Native support for Albanian grammar and vocabulary
- Understanding of Albanian cultural context
- Handling of Albanian-specific logical expressions and constructs
- Natural conversational abilities in Albanian

## Training Methodology

### GRPO Approach
This model was fine-tuned with Group Relative Policy Optimization (GRPO), a reinforcement learning technique that optimizes a model against explicit reward functions. GRPO lets the model learn from feedback on its own generated responses, improving reasoning quality over time by:

1. Generating multiple candidate responses per prompt
2. Evaluating each response against the reward criteria
3. Learning to prefer high-quality reasoning patterns
4. Optimizing for step-by-step problem solving
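
The exact training script is not published. As a rough sketch of how a GRPO loop of this shape is wired up with TRL's `GRPOTrainer`, the snippet below uses a tiny in-memory dataset, two toy reward functions, and `google/gemma-3-27b-it` as a stand-in base model; all of these are illustrative assumptions, not the actual recipe.

```python
# Illustrative GRPO setup with TRL; names, rewards, and data are placeholders.
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Minimal stand-in dataset: GRPOTrainer reads the "prompt" column and forwards any
# extra columns (here "answer") to the reward functions.
train = Dataset.from_dict({
    "prompt": [
        "Sa është 12 + 7?",  # "What is 12 + 7?"
        "Një trekëndësh ka bazë 6 cm dhe lartësi 4 cm. Sa është sipërfaqja?",
    ],
    "answer": ["19", "12"],
})

def correctness_reward(completions, answer, **kwargs):
    # 1.0 when the reference answer appears in the completion, else 0.0.
    return [1.0 if ref in out else 0.0 for out, ref in zip(completions, answer)]

def steps_reward(completions, **kwargs):
    # Crude proxy for step-by-step reasoning: reward multi-line answers.
    return [min(out.count("\n"), 5) / 5.0 for out in completions]

args = GRPOConfig(
    output_dir="grpo-out",
    num_generations=4,           # candidate responses sampled per prompt (step 1)
    max_completion_length=512,
    per_device_train_batch_size=4,
)

trainer = GRPOTrainer(
    model="google/gemma-3-27b-it",  # placeholder; the real base is the Bleta adaptation
    reward_funcs=[correctness_reward, steps_reward],  # step 2: scoring
    args=args,
    train_dataset=train,
)
trainer.train()  # steps 3-4: learn to prefer high-reward, step-by-step responses
```

The earlier version of this card listed format adherence, answer accuracy, and reasoning quality as the reward functions actually used.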

### GSM8K Dataset
Training used the GSM8K (Grade School Math 8K) dataset, which contains over 8,000 high-quality grade-school math word problems that require step-by-step reasoning to solve. The dataset provides:

- Diverse mathematical problem types
- Multi-step reasoning challenges
- Clear step-by-step solutions
- Grade-school level complexity

This dataset was adapted for Albanian-language training so the model can handle mathematical reasoning tasks in Albanian.
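
For reference, this is roughly how GSM8K is loaded and reduced to prompt/answer pairs for reward checking. The snippet uses the public English release (`openai/gsm8k`); the Albanian adaptation mentioned above is not published, so the translation step is omitted.

```python
# Load GSM8K and split off the final numeric answer (after "####") for reward checks.
from datasets import load_dataset

gsm8k = load_dataset("openai/gsm8k", "main", split="train")

def to_prompt_answer(example):
    # GSM8K solutions end with a line like "#### 42"; keep only that final answer.
    final_answer = example["answer"].split("####")[-1].strip()
    return {"prompt": example["question"], "answer": final_answer}

train = gsm8k.map(to_prompt_answer, remove_columns=gsm8k.column_names)
print(train[0]["prompt"][:80], "->", train[0]["answer"])
```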

## Technical Specifications

### Model Architecture
- 27B parameters
- Based on the Gemma 3 architecture with Albanian adaptations
- 128K context window
- QK normalization
- Interleaved attention: 5 sliding-window layers per global-attention layer
- 1024-token sliding window

### Usage Requirements
- Recommended minimum of 48GB GPU VRAM for full-precision inference
- Compatible with the Hugging Face Transformers library
- Can be loaded with 4-bit or 8-bit quantization for lower-resource environments (see the quantized-loading examples below)

## Usage with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "klei1/bleta-logjike-27b"

# 8-bit loading keeps the 27B model within a single large GPU; drop quantization_config
# if you have enough VRAM for full precision.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    # "How is the area of a triangle calculated?"
    {"role": "user", "content": "Si llogaritet sipërfaqja e një trekëndëshi?"}
]

# Build the chat-formatted prompt, then tokenize it.
text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
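
The 4-bit option mentioned under Usage Requirements works the same way through `BitsAndBytesConfig`; a minimal sketch (the quantization settings are common defaults, not values tuned for this model):

```python
# Optional: 4-bit quantized loading for GPUs with less VRAM (requires bitsandbytes).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "klei1/bleta-logjike-27b"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=bnb_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```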

## Limitations

This is the full-precision version of the model and it requires significant computational resources. For deployment on consumer hardware, consider the 8-bit quantized GGUF version available at klei1/bleta-logjike-27b-finetune.
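
For that consumer-hardware path, the GGUF build can be run through llama-cpp-python; the sketch below assumes a quantized file exists in the klei1/bleta-logjike-27b-finetune repository, and the filename is a guess, so check the repo for the actual one.

```python
# Sketch: run the quantized GGUF build with llama-cpp-python (pip install llama-cpp-python).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# The filename below is assumed; list the repository files to find the real GGUF name.
gguf_path = hf_hub_download(
    repo_id="klei1/bleta-logjike-27b-finetune",
    filename="bleta-logjike-27b-q8_0.gguf",
)

llm = Llama(model_path=gguf_path, n_ctx=4096, n_gpu_layers=-1)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Si llogaritet sipërfaqja e një trekëndëshi?"}],
    max_tokens=512,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```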

## Acknowledgments
- Google for developing the Gemma 3 architecture
- OpenAI for the GSM8K dataset
- Hugging Face for their TRL library and GRPO implementation