klei1 committed · verified · Commit 817e0c5 · Parent: 493b043

Update README.md

Files changed (1):
  1. README.md +72 -44
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- base_model: bleta-meditor-27b
  tags:
  - text-generation-inference
  - transformers
@@ -8,6 +8,8 @@ tags:
  - reasoning
  - mathematics
  - grpo
  license: apache-2.0
  language:
  - al
@@ -19,70 +21,96 @@ inference:
  max_new_tokens: 512
  ---

- # Bleta-Meditor 27B GRPO Albanian Reasoning Model

  ## Model Description
  - **Developed by:** klei aliaj
- - **Model type:** Bleta-Meditor 27B fine-tuned with GRPO for Albanian reasoning tasks
  - **License:** apache-2.0
- - **Finetuned from model:** Bleta-Meditor 27B (based on Gemma 3 architecture)
  - **Language:** Albanian
- - **Framework:** Hugging Face Transformers

- This model is a fine-tuned version of the Bleta-Meditor 27B model, specifically optimized for the Albanian language using Group Relative Policy Optimization (GRPO) to improve its reasoning capabilities. Bleta is an Albanian adaptation based on Google's Gemma 3 architecture.

- ## Capabilities & Training

- ### Fine-tuning Approach
- This Albanian language model was fine-tuned using GRPO (Group Relative Policy Optimization), a reinforcement learning technique that trains models to optimize for specific reward functions. The model was trained to:

- 1. Follow a specific reasoning format with dedicated sections for workings and solutions
- 2. Produce correct mathematical solutions in Albanian
- 3. Show clear step-by-step reasoning processes

- ### Special Formatting
- The model has been trained to follow a specific reasoning format:
- - Working out/reasoning sections are enclosed within `<start_working_out>` and `<end_working_out>` tags
- - Final solutions are provided between `<SOLUTION>` and `</SOLUTION>` tags

- ### Training Configuration
- - **Framework:** Hugging Face's TRL library
- - **Optimization:** LoRA fine-tuning (r=8, alpha=8)
- - **Reward Functions:** Format adherence, answer accuracy, and reasoning quality
- - **Language Focus:** Optimized for Albanian

- ## Technical Specifications

- ### Available Formats
- This model is available in two formats:
- - Standard adapter format (adapter_model.safetensors)
- - GGUF 8-bit quantized format (bleta-meditor-27b-finetune.Q8_0.gguf) for use with llama.cpp

- ### Bleta-Meditor Architecture Benefits
  - 27B parameters
  - 128K context window
  - QK normalization
  - 5 sliding + 1 global attention pattern
  - 1024 sliding window attention
- - Albanian language optimization

  ## Limitations
- - While this model excels at Albanian reasoning tasks, particularly mathematical problems, it may still occasionally provide incorrect solutions for complex problems.
- - The model's performance might vary depending on problem complexity and wording.
- - Like all language models, it may occasionally hallucinate or provide incorrect information outside its training domain.

  ## Acknowledgments
  - Google for developing the Gemma 3 architecture
- - Hugging Face for their TRL library and GRPO implementation
-
- ## Citation
- If you use this model in your research, please cite:
- ```
- @misc{klei_aliaj_bleta_meditor,
-   author = {Klei Aliaj},
-   title = {Bleta-Meditor 27B GRPO Albanian Reasoning Model},
-   year = {2025},
-   publisher = {Hugging Face},
-   howpublished = {\url{https://huggingface.co/klei1/bleta-meditor-27b-finetune}}
- }
- ```

  ---
+ base_model: bleta-logjike-27b
  tags:
  - text-generation-inference
  - transformers

  - reasoning
  - mathematics
  - grpo
+ - gsm8k
+ - conversational
  license: apache-2.0
  language:
  - al

  max_new_tokens: 512
  ---

+ # Bleta-Logjike 27B Albanian Logical Reasoning Model

  ## Model Description
  - **Developed by:** klei aliaj
+ - **Model type:** Bleta-Logjike 27B optimized for Albanian logical reasoning
  - **License:** apache-2.0
+ - **Format:** Full-precision model (Hugging Face Transformers format)
  - **Language:** Albanian
+ - **Base architecture:** Gemma 3 27B

+ This is the full-precision version of Bleta-Logjike 27B, optimized for logical reasoning tasks in the Albanian language. Bleta is an Albanian adaptation of Google's Gemma 3 architecture, and this version focuses on logical reasoning and problem solving for Albanian speakers.

+ ## Capabilities & Features

+ ### Logical Reasoning Focus
+ This Albanian language model excels at:

+ 1. Logical analysis and deduction in Albanian
+ 2. Step-by-step problem solving
+ 3. Structured reasoning for complex problems
+ 4. Understanding logical relationships and dependencies
+ 5. Mathematical reasoning for grade-school level problems
+ 6. Conversational reasoning and explanations

+ ### Albanian Language Optimization
+ - Native support for Albanian grammar and vocabulary
+ - Understanding of Albanian cultural context
+ - Handling of Albanian-specific logical expressions and constructs
+ - Natural conversational abilities in Albanian

+ ## Training Methodology

+ ### GRPO Approach
+ This model was fine-tuned using Group Relative Policy Optimization (GRPO), a reinforcement learning technique that trains models to optimize for specific reward functions. GRPO allows the model to learn from feedback on its generated responses, improving reasoning quality over time by the following steps (a minimal, illustrative training sketch is shown after the list):
+
+ 1. Generating multiple candidate responses
+ 2. Evaluating responses against specific reward criteria
+ 3. Learning to prefer high-quality reasoning patterns
+ 4. Optimizing for step-by-step problem solving
+
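+ As a rough illustration only, the sketch below shows how a GRPO run of this kind could be wired up with TRL's `GRPOTrainer`. The reward function, hyperparameters, and the toy Albanian prompt/answer pairs are assumptions for demonstration, not the exact recipe used to train this model.
+
+ ```python
+ # Illustrative GRPO setup with TRL (assumes a recent trl release that provides GRPOTrainer).
+ from datasets import Dataset
+ from trl import GRPOConfig, GRPOTrainer
+
+ # Toy prompt/answer pairs; the real training data were Albanian GSM8K-style problems.
+ # First prompt: "If Ana has 3 apples and buys 5 more, how many apples does she have in total?"
+ train_dataset = Dataset.from_dict({
+     "prompt": [
+         "Nëse Ana ka 3 mollë dhe blen 5 të tjera, sa mollë ka gjithsej?",
+         "Një libër kushton 12 euro. Sa kushtojnë 4 libra?",
+     ],
+     "answer": ["8", "48"],
+ })
+
+ def correctness_reward(completions, answer, **kwargs):
+     """Reward 1.0 when the reference answer appears in the generated completion."""
+     return [1.0 if ref in completion else 0.0
+             for completion, ref in zip(completions, answer)]
+
+ training_args = GRPOConfig(
+     output_dir="bleta-grpo",
+     per_device_train_batch_size=4,  # must be divisible by num_generations
+     num_generations=4,              # candidate responses sampled per prompt
+     max_completion_length=256,
+     learning_rate=5e-6,
+ )
+
+ trainer = GRPOTrainer(
+     model="klei1/bleta-logjike-27b",
+     reward_funcs=correctness_reward,
+     args=training_args,
+     train_dataset=train_dataset,
+ )
+ trainer.train()
+ ```
+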
+ ### GSM8K Dataset
+ The training utilized the GSM8K (Grade School Math 8K) dataset, which contains over 8,000 high-quality grade school math problems that require step-by-step reasoning to solve. The dataset provides:

+ - Diverse mathematical problem types
+ - Multi-step reasoning challenges
+ - Clear step-by-step solutions
+ - Grade-school level complexity

+ This dataset was adapted for Albanian-language training so that the model can handle mathematical reasoning tasks in Albanian.
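+
+ For reference, the original English GSM8K data can be pulled from the Hugging Face Hub as sketched below; the Albanian adaptation described above is assumed to be a translated copy of these fields and is not reproduced here.
+
+ ```python
+ from datasets import load_dataset
+
+ # Original English GSM8K: each row has a "question" and an "answer"
+ # whose final line starts with "#### <numeric result>".
+ gsm8k = load_dataset("openai/gsm8k", "main", split="train")
+
+ example = gsm8k[0]
+ print(example["question"])
+ final_answer = example["answer"].split("####")[-1].strip()
+ print("Final answer:", final_answer)
+ ```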
+
+ ## Technical Specifications
+
+ ### Model Architecture
  - 27B parameters
+ - Based on Gemma 3 architecture with Albanian adaptations
  - 128K context window
  - QK normalization
  - 5 sliding + 1 global attention pattern
  - 1024 sliding window attention
+
+ ### Usage Requirements
+ - Full-precision (16-bit) inference needs roughly 55-60 GB of GPU VRAM; the 27B parameters alone occupy about 54 GB at 16 bits
+ - Compatible with the Hugging Face Transformers library
+ - Can be loaded with 4-bit or 8-bit quantization for lower-resource environments (see the sketch below)
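+
+ As a rough sketch of the lower-memory path, the snippet below loads the model in 4-bit via Transformers' `BitsAndBytesConfig` (requires the bitsandbytes package); the exact settings are illustrative assumptions, and actual memory use depends on hardware and sequence length.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+
+ model_name = "klei1/bleta-logjike-27b"
+
+ # 4-bit NF4 quantization: weights shrink to roughly a quarter of their 16-bit size
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     device_map="auto",
+     quantization_config=bnb_config,
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ ```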
+
+ ## Usage with Transformers
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+
+ model_name = "klei1/bleta-logjike-27b"
+
+ # Load in 8-bit to reduce memory usage (requires the bitsandbytes package)
+ quantization_config = BitsAndBytesConfig(load_in_8bit=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     device_map="auto",
+     quantization_config=quantization_config,
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # "How is the area of a triangle calculated?"
+ messages = [
+     {"role": "user", "content": "Si llogaritet sipërfaqja e një trekëndëshi?"}
+ ]
+
+ # Build the chat-formatted prompt and tokenize it
+ text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
+
+ # do_sample=True is needed for temperature/top_p to take effect
+ outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```

  ## Limitations
+
+ This is the full-precision version of the model and requires significant computational resources. For deployment on consumer hardware, consider the 8-bit quantized GGUF version available at klei1/bleta-logjike-27b-finetune.
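+
+ A hedged sketch of running such a GGUF build with llama-cpp-python is shown below; the GGUF filename is a placeholder assumption and should be checked against the actual file listing of the klei1/bleta-logjike-27b-finetune repository.
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ from llama_cpp import Llama
+
+ # NOTE: placeholder filename; check the repository's file listing for the real GGUF name.
+ gguf_path = hf_hub_download(
+     repo_id="klei1/bleta-logjike-27b-finetune",
+     filename="bleta-logjike-27b-finetune.Q8_0.gguf",  # hypothetical
+ )
+
+ llm = Llama(model_path=gguf_path, n_ctx=4096)
+ result = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Si llogaritet sipërfaqja e një trekëndëshi?"}],
+     max_tokens=512,
+ )
+ print(result["choices"][0]["message"]["content"])
+ ```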
 

  ## Acknowledgments
  - Google for developing the Gemma 3 architecture
+ - OpenAI for the GSM8K dataset
+ - Hugging Face for their TRL library and GRPO implementation