---
base_model: bleta-logjike-27b
tags:
- text-generation-inference
- llama.cpp
- gguf
- albanian
- gemma3
- reasoning
- logical-reasoning
- grpo
- gsm8k
- mathematics
- llm
license: apache-2.0
language:
- sq
inference:
  parameters:
    temperature: 0.7
    top_p: 0.95
    top_k: 64
    max_new_tokens: 512
---

# Bleta-Logjike 27B Albanian Logical Reasoning Model (GGUF)

## Model Description

- **Developed by:** klei aliaj
- **Model type:** Bleta-Logjike 27B optimized for Albanian logical reasoning
- **License:** apache-2.0
- **Format:** GGUF 8-bit quantized for llama.cpp
- **Language:** Albanian
- **Base architecture:** Gemma 3 27B

This model is a GGUF quantized version of Bleta-Logjike 27B, optimized for logical reasoning tasks in the Albanian language. Bleta is an Albanian adaptation of Google's Gemma 3 architecture, and this version focuses on logical reasoning and problem-solving capabilities.

## Capabilities & Features

### Logical Reasoning Focus

This Albanian language model excels at:

1. Logical analysis and deduction in Albanian
2. Step-by-step problem solving
3. Structured reasoning for complex problems
4. Understanding logical relationships and dependencies
5. Mathematical reasoning for grade-school level problems

### GGUF Quantization Benefits

- **Efficient inference:** Optimized for llama.cpp and similar frameworks
- **Reduced memory usage:** 8-bit quantization substantially reduces RAM requirements
- **Faster inference:** More efficient processing on consumer hardware
- **Compatible with:** llama.cpp, Jan AI, LM Studio, and other GGUF-compatible applications

### Albanian Language Optimization

- Native support for Albanian grammar and vocabulary
- Understanding of Albanian cultural context
- Handling of Albanian-specific logical expressions and constructs

## Training Methodology

### GRPO Approach

This model was fine-tuned using Group Relative Policy Optimization (GRPO), a reinforcement learning technique that trains a model to optimize specific reward functions. GRPO lets the model learn from feedback on its generated responses, improving reasoning quality by:

1. Generating multiple candidate responses per prompt
2. Scoring each response against the reward criteria
3. Learning to prefer high-quality reasoning patterns
4. Optimizing for step-by-step problem solving

### GSM8K Dataset

Training used the GSM8K (Grade School Math 8K) dataset, which contains over 8,000 high-quality grade-school math problems that require step-by-step reasoning to solve. The dataset provides:

- Diverse mathematical problem types
- Multi-step reasoning challenges
- Clear step-by-step solutions
- Grade-school level complexity

The dataset was adapted for Albanian language training so that the model can handle mathematical reasoning tasks in Albanian.
## Technical Specifications

### Model Architecture

- 27B parameters
- Based on the Gemma 3 architecture with Albanian adaptations
- 128K context window
- QK normalization
- Attention pattern: 5 sliding-window layers per 1 global layer
- Sliding window size of 1024 tokens

### Usage Requirements

- Recommended minimum of 16 GB RAM for inference
- Runs on CPU, but a GPU is recommended
- Works with llama.cpp and compatible UIs

## Limitations

This release is an 8-bit quantized version of the 27B-parameter model. Quantization lets it run on much lower-spec hardware, but at the cost of some performance.

## Acknowledgments

- Google for developing the Gemma 3 architecture
- The llama.cpp team for the GGUF format and inference engine
- OpenAI for the GSM8K dataset
- Hugging Face for their TRL library and GRPO implementation
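## Usage Example

A minimal sketch of querying the model with llama-cpp-python. The prompt helper wraps the user message in the Gemma-style chat template used by the base architecture; the GGUF file name below is an assumption (use the actual file from this repository), while the sampling values mirror the defaults in this card's metadata.

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in the Gemma-style chat template
    expected by the base architecture."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


prompt = format_gemma_prompt("Sa është 12 + 30?")  # Albanian: "What is 12 + 30?"

# Hypothetical inference call; the model_path is an assumed file name.
# from llama_cpp import Llama
# llm = Llama(model_path="bleta-logjike-27b-q8_0.gguf", n_ctx=8192)
# out = llm(prompt, max_tokens=512, temperature=0.7, top_p=0.95, top_k=64,
#           stop=["<end_of_turn>"])
# print(out["choices"][0]["text"])
```

The same prompt format works in any GGUF-compatible frontend (Jan AI, LM Studio), which typically apply the chat template automatically.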