SVECTOR-OFFICIAL committed
Commit c9dd944 · verified · 1 Parent(s): c70ab44

Update README.md

Files changed (1):
  1. README.md +47 -14

README.md CHANGED
@@ -8,36 +8,69 @@ tags:
 
 # Theta-35-mini
 
-A distilled, lightweight version of our Theta-35 main model, built on the Qwen architecture and distilled with the GRPO technique for high efficiency and strong performance in a compact footprint.
-
-## Model Description
-
-**Theta-35-mini** is a small-footprint autoregressive language model distilled from our flagship Theta-35 model. We leveraged:
-
-- **Qwen Model Architecture**: Starting from the Qwen2 base, adapting its efficient transformer blocks and optimized attention kernels.
-- **GRPO Distillation**: Guided Representation Projection Optimization (GRPO) to transfer knowledge from Theta-35 to Theta-35-mini, preserving accuracy while drastically reducing parameter count.
-
-This makes Theta-35-mini ideal for on-device inference, low-latency applications, and scenarios with tight compute or memory budgets.
-
-## Intended Uses
-
-- **On-device text generation** (mobile apps, embedded systems)
-- **Real-time chatbots** and conversational agents
-- **Edge AI** applications with strict resource constraints
-
-## Usage
-
 ```bash
-# Install transformers
 pip install transformers
 
-# Load the model
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 tokenizer = AutoTokenizer.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
 model = AutoModelForCausalLM.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
 
-# Generate text
 inputs = tokenizer("Once upon a time", return_tensors="pt")
 outputs = model.generate(**inputs, max_length=100, temperature=0.7)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 # Theta-35-mini
 
+**A lightweight, high-efficiency reasoning model distilled from Theta-35.**
+**Theta-35-mini** is a compact 3B-parameter language model developed by **SVECTOR**, built on the Qwen architecture and trained using **Group Relative Policy Optimization (GRPO)**. It is the smaller sibling of our flagship **Theta-35** model (33B parameters), offering efficient performance in resource-constrained environments.
+
+---
+
+## 🔍 Overview
+
+- **Architecture**: Based on Qwen2-style transformer blocks
+- **Training Objective**: Autoregressive next-token prediction
+- **Technique**: Trained with **Group Relative Policy Optimization (GRPO)**, a reinforcement-learning optimization strategy enabling fine-grained control and alignment
+- **Size**: 3 billion parameters
+- **Parent Model**: [Theta-35 (33B)](https://huggingface.co/SVECTOR-CORPORATION/Theta-35)
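Group Relative Policy Optimization, named in the overview above, scores each sampled response against the statistics of its own sampling group rather than against a learned value function. A minimal, self-contained sketch of that group-relative advantage computation (the reward values and function name here are illustrative, not from the model card):

```python
# Sketch of GRPO's group-relative advantage (hypothetical rewards).
# For each prompt, a group of responses is sampled; each response's reward
# is normalized by the group mean and standard deviation, and the result
# is the advantage that weights the policy-gradient update.
from statistics import mean, stdev

def group_relative_advantages(rewards):
    """Normalize each reward against its group's mean and std deviation."""
    mu = mean(rewards)
    sigma = stdev(rewards)
    return [(r - mu) / (sigma + 1e-8) for r in rewards]

# Example: rewards for 4 sampled responses to a single prompt
advantages = group_relative_advantages([0.2, 0.5, 0.9, 0.4])
print(advantages)  # above-average responses get positive advantages
```

Responses scoring above the group mean receive positive advantages and are reinforced; below-mean responses are pushed down, with no separate critic network required.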
 
+## 🚀 Model Highlights
+
+- ✅ **Compact and Capable**: Achieves strong performance despite its small size
+- ⚙️ **GRPO-trained**: Trained with Group Relative Policy Optimization for better alignment, coherence, and efficiency
+- 💡 **Low-latency Inference**: Ideal for edge and on-device applications
+- 🌍 **Multilingual Support**: Optimized for global usage (supports multiple languages)
+
+## 📦 How to Use
+
+Install dependencies:
+
 ```bash
 pip install transformers
+```
+
+Run the model in Python:
+
+```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 tokenizer = AutoTokenizer.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
 model = AutoModelForCausalLM.from_pretrained("SVECTOR-CORPORATION/Theta-35-Mini")
 
+# Prompt input
 inputs = tokenizer("Once upon a time", return_tensors="pt")
+
+# Generate output
 outputs = model.generate(**inputs, max_length=100, temperature=0.7)
+
+# Decode and print
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
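The `temperature=0.7` argument in the generation call above rescales the next-token logits before sampling. A short, self-contained sketch of the effect (the logit values are made up for illustration):

```python
# Sketch of temperature scaling: logits are divided by the temperature
# before the softmax, so T < 1 sharpens the distribution toward the top
# token and T > 1 flattens it toward uniform sampling.
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits scaled by 1/temperature (numerically stable)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical next-token logits
p_sharp = softmax_with_temperature(logits, 0.7)
p_flat = softmax_with_temperature(logits, 1.5)
print(p_sharp[0] > p_flat[0])  # True: lower T concentrates mass on the top token
```

A setting like 0.7 keeps generation coherent while still allowing some variety; raising the temperature trades determinism for diversity.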
+
+---
+
+## 📄 License
+
+This model is released under the **MIT License**.
+
+---
+
+## 🏢 About SVECTOR
+
+🔗 Visit us at [svector.co.in](https://www.svector.co.in)
+
+---
+
+## 🙌 Acknowledgements
+
+- DeepSeek GRPO paper
+- Qwen2 architecture
+
+---