---
base_model:
- Qwen/Qwen2.5-3B-Instruct
tags:
- gguf
- q4
- text-generation-inference
- transformers
- qwen2
- trl
- grpo
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---

# TBH.AI Base Reasoning (GGUF - Q4)

- **Developed by:** TBH.AI
- **License:** apache-2.0
- **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
- **GGUF Format:** 4-bit quantized (Q4) for optimized inference

## **Model Description**

TBH.AI Base Reasoning (GGUF - Q4) is a **4-bit GGUF quantized** version of `saishshinde15/TBH.AI_Base_Reasoning`, a fine-tuned model based on **Qwen 2.5**. This version is designed for **high-efficiency inference on CPU/GPU with minimal memory usage**, making it ideal for on-device applications and low-latency AI systems.

Trained using **GRPO (Group Relative Policy Optimization)**, the model excels in **self-reasoning, logical deduction, and structured problem-solving**, producing step-by-step reasoning in the style of **DeepSeek-R1**. The **Q4 quantization** significantly lowers memory requirements while maintaining strong reasoning performance.

## **Features**

- **4-bit Quantization (Q4 GGUF):** Optimized for low-memory, high-speed inference on compatible backends.
- **Self-Reasoning AI:** Processes complex queries autonomously, generating logical and structured responses.
- **GRPO Fine-Tuning:** Uses policy optimization for improved logical consistency and step-by-step reasoning.
- **Efficient On-Device Deployment:** Works seamlessly with **llama.cpp, KoboldCpp, GPT4All, and ctransformers**.
- **Ideal for Logical Tasks:** Best suited for **research, coding logic, structured Q&A, and decision-making applications**.

## **Limitations**

- This **Q4 GGUF version is inference-only** and does not support additional fine-tuning.
- Quantization may slightly reduce response accuracy compared to FP16/full-precision models.
- Performance depends on the execution environment and GGUF-compatible runtime.
## **Usage**

Use this prompt for more detailed and personalized results. This is the recommended prompt, as the model was tuned on it:

```python
SYSTEM_PROMPT = """
You are a reasoning model made by researchers at TBH.AI, and your role is to respond in the following format only and in detail:
...
...
"""
```

Use this prompt for a more concise presentation of answers:

```python
SYSTEM_PROMPT = """
Respond in the following format:
...
...
"""
```
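As a minimal sketch of how the recommended system prompt might be wired into one of the runtimes listed above, the following uses llama-cpp-python's chat completion API. The model filename, context size, and example question are illustrative assumptions, not values from this card:

```python
import os

# Concise system prompt from the card; the elided "..." lines stand in for
# the full response format specification shown above.
SYSTEM_PROMPT = """
Respond in the following format:
...
...
"""

def build_messages(question: str) -> list[dict]:
    # Pair the system prompt with a user question in chat-message format.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

# Hypothetical local filename -- substitute the actual Q4 GGUF file
# downloaded from the repository.
MODEL_PATH = "TBH.AI_Base_Reasoning.Q4_K_M.gguf"

if os.path.exists(MODEL_PATH):
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path=MODEL_PATH, n_ctx=4096)
    result = llm.create_chat_completion(
        messages=build_messages("Why is the sky blue?")
    )
    print(result["choices"][0]["message"]["content"])
```

The same message list can be passed unchanged to other OpenAI-style chat APIs, so only the model-loading lines are runtime-specific.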