Nemotron-Research-Reasoning-Qwen-1.5B-GGUF

Nemotron-Research-Reasoning-Qwen-1.5B is the world's leading 1.5B open-weight model for complex reasoning tasks such as mathematical problems, coding challenges, scientific questions, and logic puzzles. It is trained with the ProRL (Prolonged Reinforcement Learning) algorithm on a diverse, comprehensive set of datasets. The model outperforms DeepSeek's 1.5B model (DeepSeek-R1-Distill-Qwen-1.5B) by a large margin across a broad range of benchmarks, including math, coding, and GPQA.

Model Files

| File Name | Format | Size | Precision | Use Case |
|-----------|--------|------|-----------|----------|
| Nemotron-Research-Reasoning-Qwen-1.5B.F32.gguf | GGUF | 7.11 GB | F32 | Highest precision, research use |
| Nemotron-Research-Reasoning-Qwen-1.5B.BF16.gguf | GGUF | 3.56 GB | BF16 | High precision, balanced performance |
| Nemotron-Research-Reasoning-Qwen-1.5B.F16.gguf | GGUF | 3.56 GB | F16 | High precision, memory efficient |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q8_0.gguf | GGUF | 1.89 GB | Q8_0 | Good quality, moderate compression |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q5_K_M.gguf | GGUF | 1.29 GB | Q5_K_M | Balanced quality/size (recommended) |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q5_K_S.gguf | GGUF | 1.26 GB | Q5_K_S | Good quality, smaller size |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q4_K_M.gguf | GGUF | 1.12 GB | Q4_K_M | Good balance for most users |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q4_K_S.gguf | GGUF | 1.07 GB | Q4_K_S | Decent quality, compact size |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q3_K_L.gguf | GGUF | 980 MB | Q3_K_L | Lower quality, very compact |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q3_K_M.gguf | GGUF | 924 MB | Q3_K_M | Fast inference, limited quality |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q3_K_S.gguf | GGUF | 861 MB | Q3_K_S | Fastest inference, basic quality |
| Nemotron-Research-Reasoning-Qwen-1.5B.Q2_K.gguf | GGUF | 753 MB | Q2_K | Minimal size, experimental use |
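
To fetch a single quant rather than cloning the whole repository, the huggingface_hub client can download one file by name. A minimal sketch, assuming the repo id below (inferred from this card's title; verify it against the actual repository before use) and the Q5_K_M filename from the table:

```python
# Minimal single-file download via huggingface_hub.
from huggingface_hub import hf_hub_download

REPO_ID = "prithivMLmods/Nemotron-Research-Reasoning-Qwen-1.5B-GGUF"  # assumed repo id
FILENAME = "Nemotron-Research-Reasoning-Qwen-1.5B.Q5_K_M.gguf"        # recommended quant

# Downloads into the local Hugging Face cache and returns the file path.
model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
print(f"Model downloaded to: {model_path}")
```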

Quick Selection Guide

  • For Research/Development: Use F32 or BF16 for maximum accuracy
  • For Production (Recommended): Use Q5_K_M for the best quality/performance balance (see the loading sketch after this list)
  • For Resource-Constrained Environments: Use Q4_K_M or Q4_K_S
  • For Edge Devices: Use Q3_K_M or Q2_K for minimal footprint
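
One common way to run any of these files locally is llama-cpp-python (`pip install llama-cpp-python`). The sketch below is illustrative rather than a prescribed setup: the model path points at the recommended Q5_K_M quant, and the context size and sampling settings are assumptions to tune for your workload.

```python
# Minimal local inference with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="Nemotron-Research-Reasoning-Qwen-1.5B.Q5_K_M.gguf",  # any quant from the table
    n_ctx=4096,        # context window; raise for long chain-of-thought outputs
    n_gpu_layers=-1,   # offload all layers if a GPU build is installed; 0 = CPU only
)

# create_chat_completion applies the chat template stored in the GGUF metadata.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Solve: what is 17 * 23?"}],
    max_tokens=512,
    temperature=0.6,
)
print(out["choices"][0]["message"]["content"])
```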

Quants Usage

(The table above is sorted by size, not necessarily by quality. IQ-quants are often preferable to similarly sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[Graph not included in this export: ikawrakow's comparison of lower-quality quant types; lower is better.]
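
A rough rule of thumb for sizing a quant to your hardware: total memory is approximately the GGUF file size plus the KV cache for your chosen context length plus runtime overhead. The sketch below encodes that heuristic; the layer and head constants are assumed from the qwen2-1.5B configuration and should be checked against the GGUF metadata, so treat the output as an estimate only.

```python
# Back-of-the-envelope memory estimate: weights + f16 KV cache + overhead.
# Architecture constants are assumed qwen2-1.5B values; verify against GGUF metadata.

N_LAYERS, N_KV_HEADS, HEAD_DIM = 28, 2, 128  # assumed qwen2-1.5B configuration
BYTES_F16 = 2

def kv_cache_mb(n_ctx: int) -> float:
    """f16 K+V cache size in MB for a given context length."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_F16  # K and V per layer
    return per_token * n_ctx / 1024**2

def total_estimate_mb(file_size_mb: float, n_ctx: int, overhead_mb: float = 300) -> float:
    """Rough total RAM/VRAM: weights + KV cache + runtime overhead (assumed 300 MB)."""
    return file_size_mb + kv_cache_mb(n_ctx) + overhead_mb

# File sizes (MB) taken from the table above.
quants = {"Q5_K_M": 1290, "Q4_K_M": 1120, "Q3_K_M": 924, "Q2_K": 753}
for name, size in quants.items():
    print(f"{name}: ~{total_estimate_mb(size, n_ctx=4096):.0f} MB at 4096 context")
```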

Model Details

  • Architecture: qwen2
  • Parameters: 1.78B
  • Format: GGUF
