SmolLM3-3B-GGUF

Original Model

HuggingFaceTB/SmolLM3-3B

Run with LlamaEdge

  • LlamaEdge version: coming soon
  • Prompt template

    • Normal chat

      • Prompt type: smol3-no-think

      • Prompt string

        <|im_start|>system
        ## Metadata
        
        Knowledge Cutoff Date: June 2025
        Today Date: 15 July 2025
        Reasoning Mode: /no_think
        
        ## Custom Instructions
        
        You are a helpful AI assistant named SmolLM, trained by Hugging Face.
        <|im_end|>
        <|im_start|>user
        {user_message_1}
        <|im_end|>
        <|im_start|>assistant
        {assistant_message_1}
        <|im_end|>
        <|im_start|>user
        {user_message_2}
        <|im_end|>
        <|im_start|>assistant
        
    • Chat with Tools (an example tool-call request is sketched after this list)

      • Prompt type: smol3-no-think

      • Prompt string

        <|im_start|>system
        ## Metadata
        
        Knowledge Cutoff Date: June 2025
        Today Date: 15 July 2025
        Reasoning Mode: /no_think
        
        ## Custom Instructions
        
        You are a helpful AI assistant named SmolLM, trained by Hugging Face.
        
        ### Tools
        
        You may call one or more functions to assist with the user query.
        You are provided with function signatures within <tools></tools> XML tags:
        
        <tools>
        {"name":"sum","description":"Calculate the sum of two numbers","parameters":{"$schema":"http://json-schema.org/draft-07/schema#","properties":{"a":{"description":"The left hand side number","format":"int32","type":"integer"},"b":{"description":"The right hand side number","format":"int32","type":"integer"}},"required":["a","b"],"title":"SumRequest","type":"object"}}
        {"name":"sub","description":"Calculate the difference of two numbers","parameters":{"$schema":"http://json-schema.org/draft-07/schema#","properties":{"a":{"description":"The left hand side number","format":"int32","type":"integer"},"b":{"description":"The right hand side number","format":"int32","type":"integer"}},"required":["a","b"],"title":"SubRequest","type":"object"}}
        {"name":"get_current_weather","description":"Get the weather for a given city","parameters":{"$schema":"http://json-schema.org/draft-07/schema#","definitions":{"TemperatureUnit":{"enum":["celsius","fahrenheit"],"type":"string"}},"properties":{"location":{"description":"the city to get the weather for, e.g., 'Beijing', 'New York', 'Tokyo'","type":"string"},"unit":{"$ref":"#/definitions/TemperatureUnit","description":"the unit to use for the temperature, e.g., 'celsius', 'fahrenheit'"}},"required":["location","unit"],"title":"GetWeatherRequest","type":"object"}}
        </tools>
        
        For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
        <tool_call>
        {"name": <function-name>, "arguments": <args-json-object>}
        </tool_call>
        
        <|im_end|>
        <|im_start|>user
        {user_message_1}
        <|im_end|>
        <|im_start|>assistant
        {assistant_message_1}
        <|im_end|>
        <|im_start|>user
        {user_message_2}
        <|im_end|>
        <|im_start|>assistant
        
    
  • Context size: 128000

  • Run as LlamaEdge service (a sample API request is sketched after this list)

    wasmedge --dir .:. --nn-preload default:GGML:AUTO:SmolLM3-3B-Q5_K_M.gguf \
      llama-api-server.wasm \
      --model-name SmolLM3-3B \
      --prompt-template smol3-no-think \
      --ctx-size 128000
    
  • Run as LlamaEdge command app

    wasmedge --dir .:. --nn-preload default:GGML:AUTO:SmolLM3-3B-Q5_K_M.gguf \
      llama-chat.wasm \
      --prompt-template smol3-no-think \
      --ctx-size 128000
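
Once the API server above is running, you can exercise the normal chat template through its OpenAI-compatible /v1/chat/completions endpoint. The sketch below assumes the server listens on the default port 8080 and uses the model name passed via --model-name; adjust the address if you bind a different socket.

    curl -X POST http://localhost:8080/v1/chat/completions \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "SmolLM3-3B",
        "messages": [
          {"role": "user", "content": "What is the capital of France?"}
        ]
      }'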
    

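The tools template can be exercised through the same endpoint by passing OpenAI-style tool definitions, which the server is expected to render into the <tools> block shown above. This is a hedged sketch that reuses the sum function from the example prompt; the request shape follows the OpenAI chat completions format, and the exact handling depends on the LlamaEdge version.

    curl -X POST http://localhost:8080/v1/chat/completions \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "SmolLM3-3B",
        "messages": [
          {"role": "user", "content": "What is 3 plus 5?"}
        ],
        "tools": [
          {
            "type": "function",
            "function": {
              "name": "sum",
              "description": "Calculate the sum of two numbers",
              "parameters": {
                "type": "object",
                "properties": {
                  "a": {"type": "integer", "description": "The left hand side number"},
                  "b": {"type": "integer", "description": "The right hand side number"}
                },
                "required": ["a", "b"]
              }
            }
          }
        ]
      }'

When the model decides to call a function, its raw completion contains a block such as <tool_call>{"name": "sum", "arguments": {"a": 3, "b": 5}}</tool_call>, which the server is expected to translate into the tool_calls field of the response.
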
Quantized GGUF Models

| Name | Quant method | Bits | Size | Use case |
| ---- | ---- | ---- | ---- | -------- |
| SmolLM3-3B-Q2_K.gguf | Q2_K | 2 | 1.25 GB | smallest, significant quality loss - not recommended for most purposes |
| SmolLM3-3B-Q3_K_L.gguf | Q3_K_L | 3 | 1.69 GB | small, substantial quality loss |
| SmolLM3-3B-Q3_K_M.gguf | Q3_K_M | 3 | 1.57 GB | very small, high quality loss |
| SmolLM3-3B-Q3_K_S.gguf | Q3_K_S | 3 | 1.43 GB | very small, high quality loss |
| SmolLM3-3B-Q4_0.gguf | Q4_0 | 4 | 1.81 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| SmolLM3-3B-Q4_K_M.gguf | Q4_K_M | 4 | 1.92 GB | medium, balanced quality - recommended |
| SmolLM3-3B-Q4_K_S.gguf | Q4_K_S | 4 | 1.82 GB | small, greater quality loss |
| SmolLM3-3B-Q5_0.gguf | Q5_0 | 5 | 2.16 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| SmolLM3-3B-Q5_K_M.gguf | Q5_K_M | 5 | 2.21 GB | large, very low quality loss - recommended |
| SmolLM3-3B-Q5_K_S.gguf | Q5_K_S | 5 | 2.16 GB | large, low quality loss - recommended |
| SmolLM3-3B-Q6_K.gguf | Q6_K | 6 | 2.53 GB | very large, extremely low quality loss |
| SmolLM3-3B-Q8_0.gguf | Q8_0 | 8 | 3.28 GB | very large, extremely low quality loss - not recommended |
| SmolLM3-3B-f16.gguf | f16 | 16 | 6.16 GB | |

Quantized with llama.cpp b5889
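
To fetch one of the quantized files listed above, download it directly from this repository before running the commands in this card. A minimal sketch using curl (any Hugging Face download tool works equally well); substitute the file name for the quantization level you want:

    curl -LO https://huggingface.co/second-state/SmolLM3-3B-GGUF/resolve/main/SmolLM3-3B-Q5_K_M.gguf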
