---
base_model: bleta-logjike-27b
tags:
- text-generation-inference
- transformers
- albanian
- gemma3
- reasoning
- mathematics
- grpo
- gsm8k
- conversational
license: apache-2.0
language:
- sq
inference:
  parameters:
    temperature: 0.7
    top_p: 0.95
    top_k: 64
    max_new_tokens: 512
---

# Bleta-Logjike 27B Albanian Logical Reasoning Model

## Model Description
- **Developed by:** klei aliaj & Armir Celiku
- **Model type:** Bleta-Logjike 27B optimized for Albanian logical reasoning
- **License:** apache-2.0
- **Format:** Full-precision model (HuggingFace Transformers format)
- **Language:** Albanian
- **Base architecture:** Gemma 3 27B

This model is the full-precision version of Bleta-Logjike 27B, optimized specifically for logical reasoning tasks in the Albanian language. Bleta is an Albanian-language adaptation of Google's Gemma 3 architecture; this version focuses on strengthening logical reasoning and problem-solving capabilities for Albanian speakers.

## Capabilities & Features

### Logical Reasoning Focus
This Albanian language model excels at:

1. Logical analysis and deduction in Albanian
2. Step-by-step problem solving
3. Structured reasoning for complex problems
4. Understanding logical relationships and dependencies
5. Mathematical reasoning for grade-school level problems
6. Conversational reasoning and explanations

### Albanian Language Optimization
- Native support for Albanian grammar and vocabulary
- Understanding of Albanian cultural context
- Handling of Albanian-specific logical expressions and constructs
- Natural conversational abilities in Albanian

## Training Methodology

### GRPO Approach
This model was fine-tuned using Group Relative Policy Optimization (GRPO), a reinforcement learning technique that optimizes a model against task-specific reward functions. GRPO lets the model learn from feedback on its own generated responses (a training sketch follows the list below), improving reasoning quality over time by:

1. Generating multiple candidate responses
2. Evaluating responses against specific reward criteria
3. Learning to prefer high-quality reasoning patterns
4. Optimizing for step-by-step problem solving
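A minimal sketch of what such a GRPO run could look like with Hugging Face TRL (acknowledged below). The reward function, base checkpoint, example prompt, and hyperparameters here are illustrative assumptions, not the exact recipe used for this model:

```python
# Illustrative GRPO sketch with TRL; not the actual training configuration of Bleta-Logjike.
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Tiny in-memory stand-in for the Albanian GSM8K-style training prompts.
train_dataset = Dataset.from_dict({
    "prompt": ["Nëse një libër kushton 5 euro dhe blej 3 libra, sa paguaj gjithsej?"],
    "answer": ["15"],
})

def correctness_reward(completions, answer, **kwargs):
    # Hypothetical reward: 1.0 if the completion ends with the reference answer, else 0.0.
    return [1.0 if completion.strip().endswith(ref) else 0.0
            for completion, ref in zip(completions, answer)]

config = GRPOConfig(
    output_dir="bleta-logjike-grpo",
    per_device_train_batch_size=4,
    num_generations=4,            # candidate responses sampled per prompt
    max_completion_length=512,
    learning_rate=1e-6,
)

trainer = GRPOTrainer(
    model="google/gemma-3-27b-it",   # assumed base checkpoint
    reward_funcs=correctness_reward,
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```

In practice the reward would also score intermediate reasoning steps and answer formatting, not only the final number.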

### GSM8K Dataset
The training utilized the GSM8K (Grade School Math 8K) dataset, which contains over 8,000 high-quality grade-school math problems that require step-by-step reasoning to solve. The dataset provides:

- Diverse mathematical problem types
- Multi-step reasoning challenges
- Clear step-by-step solutions
- Grade-school level complexity

This dataset was adapted for Albanian language training to ensure the model can handle mathematical reasoning tasks in Albanian.
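For reference, each GSM8K record pairs a word problem with a worked solution whose final line carries the answer after a `####` marker. A small sketch of pulling that structure apart (the Albanian-translated split used for training is not published with this card, so the English original is loaded here as an assumption):

```python
# Sketch: inspecting GSM8K structure; the Albanian adaptation would translate the
# "question" field, which is kept in English in this example.
from datasets import load_dataset

gsm8k = load_dataset("openai/gsm8k", "main", split="train")

record = gsm8k[0]
steps, final_answer = record["answer"].rsplit("####", 1)
print("Problem:     ", record["question"])
print("Reasoning:   ", steps.strip())
print("Final answer:", final_answer.strip())
```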

## Technical Specifications

### Model Architecture
- 27B parameters
- Based on Gemma 3 architecture with Albanian adaptations
- 128K context window
- QK normalization
- Interleaved attention pattern: 5 sliding-window layers for every global attention layer
- 1,024-token sliding window (these values can be checked against the published config, as sketched below)
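A quick way to verify these figures from the published checkpoint without downloading the weights (attribute names follow the Gemma-style configs in transformers and may vary slightly between library versions):

```python
# Sketch: reading architecture details from the model config only.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("klei1/bleta-logjike-27b")
text_config = getattr(config, "text_config", config)  # Gemma 3 nests the text config

print("Context window:", text_config.max_position_embeddings)
print("Sliding window:", getattr(text_config, "sliding_window", "n/a"))
print("Hidden layers: ", text_config.num_hidden_layers)
```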

### Usage Requirements
- Recommended minimum 48GB GPU VRAM for full-precision inference
- Compatible with Hugging Face Transformers library
- Can be loaded with 4-bit or 8-bit quantization for lower-resource environments (see the 4-bit sketch below)
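A sketch of 4-bit loading with bitsandbytes via transformers' `BitsAndBytesConfig`; the quantization settings are illustrative and should be adjusted to the available hardware:

```python
# Sketch: loading the model in 4-bit NF4 to substantially reduce the memory footprint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "klei1/bleta-logjike-27b",
    device_map="auto",
    quantization_config=quant_config,
)
tokenizer = AutoTokenizer.from_pretrained("klei1/bleta-logjike-27b")
```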

## Usage with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "klei1/bleta-logjike-27b"

# 8-bit loading keeps VRAM usage manageable; drop quantization_config for full precision.
quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", quantization_config=quant_config)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    # "How is the area of a triangle calculated?"
    {"role": "user", "content": "Si llogaritet sipërfaqja e një trekëndëshi?"}
]

# Build the chat-formatted prompt and tokenize it onto the model's device.
text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True is required for temperature/top_p/top_k to take effect.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95, top_k=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
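If incremental output is preferred, transformers' `TextStreamer` can print tokens as they are generated. A minimal sketch reusing the `model`, `tokenizer`, and `inputs` from the example above:

```python
from transformers import TextStreamer

# Streams decoded tokens to stdout as they are produced, skipping the prompt echo.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95, streamer=streamer)
```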

## Limitations

This is the full-precision version of the model and requires significant computational resources. For deployment on consumer hardware, consider the 8-bit quantized GGUF version available at klei1/bleta-logjike-27b-finetune.

## Acknowledgments
- Google for developing the Gemma 3 architecture
- OpenAI for the GSM8K dataset
- Hugging Face for their TRL library and GRPO implementation