SVECTOR-OFFICIAL committed
Commit c9dd944 · verified · 1 Parent(s): c70ab44

Update README.md

Files changed (1):
  1. README.md +47 -14

README.md CHANGED
@@ -8,36 +8,69 @@ tags:
 
 # Theta-35-mini
 
-A distilled, lightweight version of our Theta-35 main model, built on the Qwen architecture and distilled with the GRPO technique for high efficiency and strong performance in a compact footprint.
-
-## Model Description
-
-**Theta-35-mini** is a small-footprint autoregressive language model distilled from our flagship Theta-35 model. We leveraged:
-
-- **Qwen Model Architecture**: Starting from the Qwen2 base, adapting its efficient transformer blocks and optimized attention kernels.
-- **GRPO Distillation**: Guided Representation Projection Optimization (GRPO) to transfer knowledge from Theta-35 to Theta-35-mini, preserving accuracy while drastically reducing parameter count.
-
-This makes Theta-35-mini ideal for on-device inference, low-latency applications, and scenarios with tight compute or memory budgets.
-
-## Intended Uses
-
-- **On-device text generation** (mobile apps, embedded systems)
-- **Real-time chatbots** and conversational agents
-- **Edge AI** applications with strict resource constraints
-
-## Usage
-
 ```bash
-# Install transformers
 pip install transformers
 
-# Load the model
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 tokenizer = AutoTokenizer.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
 model = AutoModelForCausalLM.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
 
-# Generate text
 inputs = tokenizer("Once upon a time", return_tensors="pt")
 outputs = model.generate(**inputs, max_length=100, temperature=0.7)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 # Theta-35-mini
 
+**A lightweight, high-efficiency reasoning model distilled from Theta-35.**
+**Theta-35-mini** is a compact 3B-parameter language model developed by **SVECTOR**, built on the Qwen architecture and trained using **Group Relative Policy Optimization (GRPO)**. It is the smaller sibling of our flagship **Theta-35** model (33B parameters), offering efficient performance in resource-constrained environments.
+
+---
+
+## 🔍 Overview
+
+- **Architecture**: Based on Qwen2-style transformer blocks
+- **Training Objective**: Autoregressive next-token prediction
+- **Technique**: Trained with **Group Relative Policy Optimization (GRPO)**, a reinforcement-learning optimization strategy enabling fine-grained control and alignment
+- **Size**: 3 billion parameters
+- **Parent Model**: [Theta-35 (33B)](https://huggingface.co/SVECTOR-CORPORATION/Theta-35)
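Group Relative Policy Optimization, named in the overview above, scores each sampled response against the statistics of its own sampling group rather than against a learned value function. A minimal, self-contained sketch of that group-relative advantage computation (the reward values and function name here are illustrative, not from the model card):

```python
# Sketch of GRPO's group-relative advantage (hypothetical rewards).
# For each prompt, a group of responses is sampled; each response's reward
# is normalized by the group mean and standard deviation, and the result
# is the advantage that weights the policy-gradient update.
from statistics import mean, stdev

def group_relative_advantages(rewards):
    """Normalize each reward against its group's mean and std deviation."""
    mu = mean(rewards)
    sigma = stdev(rewards)
    return [(r - mu) / (sigma + 1e-8) for r in rewards]

# Example: rewards for 4 sampled responses to a single prompt
advantages = group_relative_advantages([0.2, 0.5, 0.9, 0.4])
print(advantages)  # above-average responses get positive advantages
```

Responses scoring above the group mean receive positive advantages and are reinforced; below-mean responses are pushed down, with no separate critic network required.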
 
+## 🚀 Model Highlights
+
+- ✅ **Compact and Capable**: Achieves strong performance despite its small size
+- ⚙️ **GRPO-trained**: Trained with Group Relative Policy Optimization for better alignment, coherence, and efficiency
+- 💡 **Low-latency Inference**: Ideal for edge and on-device applications
+- 🌍 **Multilingual Support**: Optimized for global usage (supports multiple languages)
+
+## 📦 How to Use
+
+Install dependencies:
+
 ```bash
 pip install transformers
+```
+
+Run the model in Python:
+
+```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 tokenizer = AutoTokenizer.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
 model = AutoModelForCausalLM.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
 
+# Prompt input
 inputs = tokenizer("Once upon a time", return_tensors="pt")
+
+# Generate output
 outputs = model.generate(**inputs, max_length=100, temperature=0.7)
+
+# Decode and print
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
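The `temperature=0.7` argument in the generation call above rescales the next-token logits before sampling. A short, self-contained sketch of the effect (the logit values are made up for illustration):

```python
# Sketch of temperature scaling: logits are divided by the temperature
# before the softmax, so T < 1 sharpens the distribution toward the top
# token and T > 1 flattens it toward uniform sampling.
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits scaled by 1/temperature (numerically stable)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical next-token logits
p_sharp = softmax_with_temperature(logits, 0.7)
p_flat = softmax_with_temperature(logits, 1.5)
print(p_sharp[0] > p_flat[0])  # True: lower T concentrates mass on the top token
```

A setting like 0.7 keeps generation coherent while still allowing some variety; raising the temperature trades determinism for diversity.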
+
+---
+
+## 📄 License
+
+This model is released under the **MIT License**.
+
+---
+
+## 🏢 About SVECTOR
+
+🔗 Visit us at [svector.co.in](https://www.svector.co.in)
+
+---
+
+## 🙌 Acknowledgements
+
+- DeepSeek GRPO paper
+- Qwen2 architecture
+
+---