Hermes-3-Llama-3.2-3B-f32-GGUF

Hermes 3 - Llama-3.2 3B by Nous Research is a compact yet powerful generalist language model fine-tuned from the Llama-3.2 3B foundation, designed to deliver advanced agentic capabilities, improved roleplaying, complex reasoning, and coherent multi-turn conversations with long context understanding. It offers powerful user steering and control, supports structured prompting with ChatML format, and enables function calling and JSON output for dynamic interaction and automation.

Trained on high-performance GPU infrastructure, Hermes 3 excels in problem-solving, dialogue, and code generation tasks, demonstrating competitive benchmark scores across GPT4All, AGIEval, and BigBench. This model is optimized for efficiency, supports 4-bit quantization for deployment, and is ideal for building sophisticated AI assistants with controllable and natural user interactions.

Model Files

Model File name	Size	QuantType
Hermes-3-Llama-3.2-3B.BF16.gguf	6.43 GB	BF16
Hermes-3-Llama-3.2-3B.F16.gguf	6.43 GB	F16
Hermes-3-Llama-3.2-3B.F32.gguf	12.9 GB	F32
Hermes-3-Llama-3.2-3B.Q2_K.gguf	1.36 GB	Q2_K
Hermes-3-Llama-3.2-3B.Q3_K_L.gguf	1.82 GB	Q3_K_L
Hermes-3-Llama-3.2-3B.Q3_K_M.gguf	1.69 GB	Q3_K_M
Hermes-3-Llama-3.2-3B.Q3_K_S.gguf	1.54 GB	Q3_K_S
Hermes-3-Llama-3.2-3B.Q4_K_M.gguf	2.02 GB	Q4_K_M
Hermes-3-Llama-3.2-3B.Q4_K_S.gguf	1.93 GB	Q4_K_S
Hermes-3-Llama-3.2-3B.Q5_K_M.gguf	2.32 GB	Q5_K_M
Hermes-3-Llama-3.2-3B.Q5_K_S.gguf	2.27 GB	Q5_K_S
Hermes-3-Llama-3.2-3B.Q6_K.gguf	2.64 GB	Q6_K
Hermes-3-Llama-3.2-3B.Q8_0.gguf	3.42 GB	Q8_0

Quants Usage

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

prithivMLmods
/

Hermes-3-Llama-3.2-3B-f32-GGUF

Hermes-3-Llama-3.2-3B-f32-GGUF

Model Files

Quants Usage

Model tree for prithivMLmods/Hermes-3-Llama-3.2-3B-f32-GGUF