lbourdois committed
Commit 992b688 · verified · 1 Parent(s): 0cb7aaf

Improve language tag


Hi! As the model is multilingual, this is a PR to add languages other than English to the language tag to improve referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.

Files changed (1)
README.md +78 -66
README.md CHANGED
@@ -1,67 +1,79 @@
- ---
- base_model:
- - Qwen/Qwen2.5-3B-Instruct
- tags:
- - gguf
- - q4
- - text-generation-inference
- - transformers
- - qwen2
- - trl
- - grpo
- license: apache-2.0
- language:
- - en
- ---
-
- # TBH.AI Base Reasoning (GGUF - Q4)
-
- - **Developed by:** TBH.AI
- - **License:** apache-2.0
- - **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
- - **GGUF Format:** 4-bit quantized (Q4) for optimized inference
-
- ## **Model Description**
- TBH.AI Base Reasoning (GGUF - Q4) is a **4-bit GGUF quantized** version of `saishshinde15/TBH.AI_Base_Reasoning`, a fine-tuned model based on **Qwen 2.5**. This version is designed for **high-efficiency inference on CPU/GPU with minimal memory usage**, making it ideal for on-device applications and low-latency AI systems.
-
- Trained using **GRPO (General Reinforcement with Policy Optimization)**, the model excels in **self-reasoning, logical deduction, and structured problem-solving**, comparable to **DeepSeek-R1**. The **Q4 quantization** ensures significantly lower memory requirements while maintaining strong reasoning performance.
-
- ## **Features**
- - **4-bit Quantization (Q4 GGUF):** Optimized for low-memory, high-speed inference on compatible backends.
- - **Self-Reasoning AI:** Can process complex queries autonomously, generating logical and structured responses.
- - **GRPO Fine-Tuning:** Uses policy optimization for improved logical consistency and step-by-step reasoning.
- - **Efficient On-Device Deployment:** Works seamlessly with **llama.cpp, KoboldCpp, GPT4All, and ctransformers**.
- - **Ideal for Logical Tasks:** Best suited for **research, coding logic, structured Q&A, and decision-making applications**.
-
- ## **Limitations**
- - This **Q4 GGUF version is inference-only** and does not support additional fine-tuning.
- - Quantization may slightly reduce response accuracy compared to FP16/full-precision models.
- - Performance depends on the execution environment and GGUF-compatible runtime.
-
- ## **Usage**
-
- # Use this prompt for more detailed and personalized results. This is the recommended prompt as the model was tuned on it.
-
- ```python
- You are a reasoning model made by researcher at TBH.AI and your role is to respond in the following format only and in detail :
-
- <reasoning>
- ...
- </reasoning>
- <answer>
- ...
- </answer>
- ```
-
- # Use this prompt for concise representation of answers.
-
- ```python
- SYSTEM_PROMPT = """
- Respond in the following format:
- <reasoning>
- ...
- </reasoning>
- <answer>
- ...
- </answer>
  """
 
+ ---
+ base_model:
+ - Qwen/Qwen2.5-3B-Instruct
+ tags:
+ - gguf
+ - q4
+ - text-generation-inference
+ - transformers
+ - qwen2
+ - trl
+ - grpo
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+
+ # TBH.AI Base Reasoning (GGUF - Q4)
+
+ - **Developed by:** TBH.AI
+ - **License:** apache-2.0
+ - **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
+ - **GGUF Format:** 4-bit quantized (Q4) for optimized inference
+
+ ## **Model Description**
+ TBH.AI Base Reasoning (GGUF - Q4) is a **4-bit GGUF quantized** version of `saishshinde15/TBH.AI_Base_Reasoning`, a fine-tuned model based on **Qwen 2.5**. This version is designed for **high-efficiency inference on CPU/GPU with minimal memory usage**, making it ideal for on-device applications and low-latency AI systems.
+
+ Trained using **GRPO (General Reinforcement with Policy Optimization)**, the model excels in **self-reasoning, logical deduction, and structured problem-solving**, comparable to **DeepSeek-R1**. The **Q4 quantization** ensures significantly lower memory requirements while maintaining strong reasoning performance.
+
+ ## **Features**
+ - **4-bit Quantization (Q4 GGUF):** Optimized for low-memory, high-speed inference on compatible backends.
+ - **Self-Reasoning AI:** Can process complex queries autonomously, generating logical and structured responses.
+ - **GRPO Fine-Tuning:** Uses policy optimization for improved logical consistency and step-by-step reasoning.
+ - **Efficient On-Device Deployment:** Works seamlessly with **llama.cpp, KoboldCpp, GPT4All, and ctransformers**.
+ - **Ideal for Logical Tasks:** Best suited for **research, coding logic, structured Q&A, and decision-making applications**.
+
+ ## **Limitations**
+ - This **Q4 GGUF version is inference-only** and does not support additional fine-tuning.
+ - Quantization may slightly reduce response accuracy compared to FP16/full-precision models.
+ - Performance depends on the execution environment and GGUF-compatible runtime.
+
+ ## **Usage**
+
+ # Use this prompt for more detailed and personalized results. This is the recommended prompt as the model was tuned on it.
+
+ ```python
+ You are a reasoning model made by researcher at TBH.AI and your role is to respond in the following format only and in detail :
+
+ <reasoning>
+ ...
+ </reasoning>
+ <answer>
+ ...
+ </answer>
+ ```
+
+ # Use this prompt for concise representation of answers.
+
+ ```python
+ SYSTEM_PROMPT = """
+ Respond in the following format:
+ <reasoning>
+ ...
+ </reasoning>
+ <answer>
+ ...
+ </answer>
  """