---
license: apache-2.0
language:
  - en
base_model:
  - prithivMLmods/OpenRHO-2B-Thinker
pipeline_tag: text-generation
library_name: transformers
tags:
  - text-generation-inference
---

# OpenRHO-2B-Thinker-GGUF

OpenRHO-2B-Thinker is a general-purpose reasoning model designed to strengthen the reasoning abilities of edge-deployed large language models (LLMs) through reinforcement learning (RL). Fine-tuned from Qwen2-1.5B-Instruct on the QwQ distill dataset, it improves logical reasoning, structured problem-solving, and lightweight coding, making it well suited to resource-constrained environments.
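While this repository hosts GGUF quants for llama.cpp-compatible runtimes, the base checkpoint can also be loaded directly with transformers. A minimal sketch, assuming the base repo id `prithivMLmods/OpenRHO-2B-Thinker` from the metadata above; the prompt and generation settings are illustrative only:

```python
# Minimal sketch: load the base (non-GGUF) checkpoint with transformers.
# Repo id taken from the base_model metadata above; settings are illustrative.
# device_map="auto" requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prithivMLmods/OpenRHO-2B-Thinker"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed? Reason step by step."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```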

## Model Files

| File Name | Size | Quantization | Format | Description |
|---|---|---|---|---|
| OpenRHO-2B-Thinker.BF16.gguf | 3.56 GB | BF16 | GGUF | BFloat16 precision version |
| OpenRHO-2B-Thinker.F16.gguf | 3.56 GB | FP16 | GGUF | Float16 precision version |
| OpenRHO-2B-Thinker.F32.gguf | 7.11 GB | FP32 | GGUF | Float32 precision version |
| OpenRHO-2B-Thinker.Q2_K.gguf | 753 MB | Q2_K | GGUF | 2-bit quantized (K variant) |
| OpenRHO-2B-Thinker.Q3_K_M.gguf | 924 MB | Q3_K_M | GGUF | 3-bit quantized (K M variant) |
| OpenRHO-2B-Thinker.Q4_K_M.gguf | 1.12 GB | Q4_K_M | GGUF | 4-bit quantized (K M variant) |
| OpenRHO-2B-Thinker.Q4_K_S.gguf | 1.07 GB | Q4_K_S | GGUF | 4-bit quantized (K S variant) |
| OpenRHO-2B-Thinker.Q5_K_M.gguf | 1.29 GB | Q5_K_M | GGUF | 5-bit quantized (K M variant) |
| OpenRHO-2B-Thinker.Q5_K_S.gguf | 1.26 GB | Q5_K_S | GGUF | 5-bit quantized (K S variant) |
| OpenRHO-2B-Thinker.Q6_K.gguf | 1.46 GB | Q6_K | GGUF | 6-bit quantized (K variant) |
| OpenRHO-2B-Thinker.Q8_0.gguf | 1.89 GB | Q8_0 | GGUF | 8-bit quantized |
| .gitattributes | 2.24 kB | — | — | Git LFS tracking file |
| config.json | 31 B | — | — | Configuration file |
| README.md | 670 B | — | — | Model documentation |
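To run one of the quantized files above, a lightweight route is to download it with huggingface_hub and load it through llama-cpp-python. A minimal sketch, assuming this repository's id is `prithivMLmods/OpenRHO-2B-Thinker-GGUF` and using the Q4_K_M file name from the table; the context size and prompt are illustrative:

```python
# Sketch: fetch a quant from this repo and run it with llama-cpp-python.
# Assumes repo id prithivMLmods/OpenRHO-2B-Thinker-GGUF and the Q4_K_M
# filename listed above. pip install huggingface_hub llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="prithivMLmods/OpenRHO-2B-Thinker-GGUF",
    filename="OpenRHO-2B-Thinker.Q4_K_M.gguf",
)

llm = Llama(model_path=gguf_path, n_ctx=4096)  # context size is illustrative
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a one-line Python function that reverses a string."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```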

## Quants Usage

(Sorted by size, not necessarily by quality. IQ-quants are often preferable over similar-sized non-IQ quants.)

| Link | Type | Size/GB | Notes |
|---|---|---|---|
| GGUF | Q2_K | 0.4 | |
| GGUF | Q3_K_S | 0.5 | |
| GGUF | Q3_K_M | 0.5 | lower quality |
| GGUF | Q3_K_L | 0.5 | |
| GGUF | IQ4_XS | 0.6 | |
| GGUF | Q4_K_S | 0.6 | fast, recommended |
| GGUF | Q4_K_M | 0.6 | fast, recommended |
| GGUF | Q5_K_S | 0.6 | |
| GGUF | Q5_K_M | 0.7 | |
| GGUF | Q6_K | 0.7 | very good quality |
| GGUF | Q8_0 | 0.9 | fast, best quality |
| GGUF | f16 | 1.6 | 16 bpw, overkill |
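A rough rule when choosing a quant is that the GGUF file size approximates the memory needed for the weights, plus extra for the KV cache. As a sketch (repo id assumed to be `prithivMLmods/OpenRHO-2B-Thinker-GGUF`, as above), you can list the available quants and their on-disk sizes programmatically:

```python
# Sketch: list the .gguf quants in this repo with their sizes,
# to pick a size/quality trade-off for your hardware.
# Assumes the repo id is prithivMLmods/OpenRHO-2B-Thinker-GGUF.
from huggingface_hub import HfApi

api = HfApi()
for entry in api.list_repo_tree("prithivMLmods/OpenRHO-2B-Thinker-GGUF"):
    if entry.path.endswith(".gguf"):
        print(f"{entry.path}: {entry.size / 1e9:.2f} GB")
```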

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

![Quantization quality comparison by ikawrakow](image.png)