Llama-4-Scout-17B-16E-Instruct-GGUF
ONLY text mode of Llama-4 supported for now!
Original Model
unsloth/Llama-4-Scout-17B-16E-Instruct
Run with LlamaEdge
LlamaEdge version: v0.16.16 and above
Prompt template
Prompt type:
llama-4-chat
Prompt string
<|begin_of_text|><|header_start|>system<|header_end|> {system_prompt}<|eot|><|header_start|>user<|header_end|> {user_message_1}<|eot|><|header_start|>assistant<|header_end|> {assistant_message_1}<|eot|><|header_start|>user<|header_end|> {user_message_2}<|eot|> <|header_start|>assistant<|header_end|>
Example: tool use
Prompt with user question and tool info
Expand to see the example
<|begin_of_text|><|header_start|>system<|header_end|> You are a helpful assistant.<|eot|> <|header_start|>user<|header_end|> Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt. Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables. [ { "type": "function", "function": { "name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "description": "The temperature unit to use. Infer this from the users location.", "enum": [ "celsius", "fahrenheit" ] } }, "required": [ "location", "unit" ] } } }, { "type": "function", "function": { "name": "predict_weather", "description": "Predict the weather in 24 hours", "parameters": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "description": "The temperature unit to use. Infer this from the users location.", "enum": [ "celsius", "fahrenheit" ] } }, "required": [ "location", "unit" ] } } }, { "type": "function", "function": { "name": "sum", "description": "Calculate the sum of two numbers", "parameters": { "type": "object", "properties": { "a": { "type": "integer", "description": "the left hand side number" }, "b": { "type": "integer", "description": "the right hand side number" } }, "required": [ "a", "b" ] } } } ] Question: How is the weather of Beijing, China in celsius?<|eot|><|header_start|>assistant<|header_end|>
Prompt with tool results
Expand to see the example
<|begin_of_text|><|header_start|>system<|header_end|> You are a helpful assistant.<|eot|> <|header_start|>user<|header_end|> How is the weather of Beijing, China in celsius?<|eot|> <|header_start|>assistant<|header_end|> {"name":"get_current_weather","arguments":"{\"location\":\"Beijing, China\",\"unit\":\"celsius\"}"}<|eot|> <|header_start|>ipython<|header_end|> {"temperature":"30","unit":"celsius"}<|eot|><|header_start|>assistant<|header_end|>
Expand to see the example
Context size:
10M
Run as LlamaEdge service
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-4-Scout-17B-16E-Instruct-Q5_K_M.gguf \ llama-api-server.wasm \ --prompt-template llama-4-chat \ --ctx-size 1000000 \ --model-name Llama-4-Scout
Run as LlamaEdge command app
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-4-Scout-17B-16E-Instruct-Q5_K_M.gguf \ llama-chat.wasm \ --prompt-template llama-4-chat \ --ctx-size 1000000
Quantized GGUF Models
Quantized with llama.cpp b5074.
- Downloads last month
- 3,115
Hardware compatibility
Log In
to view the estimation
2-bit
3-bit
4-bit
5-bit
6-bit
8-bit
16-bit
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support
Model tree for second-state/Llama-4-Scout-17B-16E-Instruct-GGUF
Base model
meta-llama/Llama-4-Scout-17B-16E
Finetuned
unsloth/Llama-4-Scout-17B-16E-Instruct