Improve language tag (#1)
Improve language tag (992b68889a2c1f131b34e1f73538ace69908f5dd)
Co-authored-by: Loïck BOURDOIS <[email protected]>
README.md
CHANGED
---
base_model:
- Qwen/Qwen2.5-3B-Instruct
tags:
- gguf
- q4
- text-generation-inference
- transformers
- qwen2
- trl
- grpo
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---

# TBH.AI Base Reasoning (GGUF - Q4)

- **Developed by:** TBH.AI
- **License:** apache-2.0
- **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
- **GGUF Format:** 4-bit quantized (Q4) for optimized inference

## **Model Description**

TBH.AI Base Reasoning (GGUF - Q4) is a **4-bit GGUF-quantized** version of `saishshinde15/TBH.AI_Base_Reasoning`, a fine-tuned model based on **Qwen 2.5**. This version is designed for **high-efficiency inference on CPU or GPU with minimal memory usage**, making it well suited to on-device applications and low-latency AI systems.

Trained with **GRPO (Group Relative Policy Optimization)**, the model excels at **self-reasoning, logical deduction, and structured problem-solving**, with behavior comparable to **DeepSeek-R1**. The **Q4 quantization** significantly lowers memory requirements while preserving strong reasoning performance.

## **Features**

- **4-bit Quantization (Q4 GGUF):** Optimized for low-memory, high-speed inference on compatible backends.
- **Self-Reasoning AI:** Processes complex queries autonomously, generating logical and structured responses.
- **GRPO Fine-Tuning:** Uses policy optimization for improved logical consistency and step-by-step reasoning.
- **Efficient On-Device Deployment:** Works with **llama.cpp, KoboldCpp, GPT4All, and ctransformers**; a minimal loading sketch follows this list.
- **Ideal for Logical Tasks:** Best suited for **research, coding logic, structured Q&A, and decision-making applications**.
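
The sketch below shows one way to load this Q4 GGUF with `llama-cpp-python`; the `.gguf` filename is a placeholder, so substitute the actual file shipped in this repository.

```python
# Minimal loading sketch (assumes llama-cpp-python is installed: pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="tbh.ai_base_reasoning.Q4_K_M.gguf",  # placeholder filename; use the .gguf file from this repo
    n_ctx=4096,       # context window
    n_threads=8,      # CPU threads to use
    n_gpu_layers=0,   # increase to offload layers if a GPU build is available
)
```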

## **Limitations**

- This **Q4 GGUF version is inference-only** and does not support additional fine-tuning.
- Quantization may slightly reduce response accuracy compared to FP16/full-precision models.
- Performance depends on the execution environment and GGUF-compatible runtime.

## **Usage**

Use the following system prompt for more detailed and personalized results. This is the recommended prompt, as the model was tuned on it:

```python
SYSTEM_PROMPT = """
You are a reasoning model made by researchers at TBH.AI, and your role is to respond in detail and only in the following format:

<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""
```
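
A minimal call sketch, assuming the `llm` object from the loading example above and `llama-cpp-python`'s chat-completion API; the user question is illustrative only:

```python
# Hedged sketch: pass the recommended system prompt through the chat template.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "A train travels 120 km in 90 minutes. What is its average speed in km/h?"},
    ],
    max_tokens=512,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```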

Use this prompt for a more concise presentation of answers:

```python
SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""
```
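
Since both prompts constrain the model to `<reasoning>`/`<answer>` tags, the output can be split programmatically. The helper below is an illustrative sketch, not part of the released code:

```python
import re

def parse_tagged_output(text: str) -> dict:
    """Extract the <reasoning> and <answer> spans from a tagged model response."""
    sections = {}
    for tag in ("reasoning", "answer"):
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections
```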