barandinho/TDM-8b-v0.1-Q8_0-GGUF

This model was converted to GGUF format from barandinho/TDM-8b-v0.1 using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

IMPORTANT NOTE:

Make sure to use this model with temperature=0.6 and top_p=0.95 for better performance.
You can also try top_k=20 and min_p=0.01.
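
If you run the model through llama.cpp (see the section below), these settings map onto the usual sampler flags. A minimal sketch, assuming a recent llama.cpp build:

# the recommended sampling settings expressed as llama.cpp sampler flags
llama-cli --hf-repo barandinho/TDM-8b-v0.1-Q8_0-GGUF --hf-file tdm-8b-v0.1-q8_0.gguf \
  --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.01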

The recommended way to use the model is through local LLM clients such as LM Studio or Ollama.

Apply the following settings before using the model:

Set this as the system prompt:

Sen TÜDÜM (TÜrkçe Düşünen Üretken Model) isimli yardımsever bir yapay zeka modelisin.
Türkçe cevap ver ve cevabını tamamla.

(English translation, for reference only; keep the prompt itself in Turkish: "You are a helpful AI model named TÜDÜM (TÜrkçe Düşünen Üretken Model, i.e. Turkish-Thinking Generative Model). Answer in Turkish and complete your answer.")
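
For Ollama, one way to bake the system prompt and the recommended sampling settings into a local model is a Modelfile. This is only a sketch: the GGUF path and the model name tdm-8b are placeholders, min_p requires a reasonably recent Ollama version, and Ollama applies its own chat template rather than the Jinja template shown further below.

# sketch: create a local Ollama model from a downloaded GGUF (path and name are placeholders)
cat > Modelfile <<'EOF'
FROM ./tdm-8b-v0.1-q8_0.gguf
SYSTEM """Sen TÜDÜM (TÜrkçe Düşünen Üretken Model) isimli yardımsever bir yapay zeka modelisin.
Türkçe cevap ver ve cevabını tamamla."""
PARAMETER temperature 0.6
PARAMETER top_p 0.95
PARAMETER top_k 20
PARAMETER min_p 0.01
EOF
ollama create tdm-8b -f Modelfile
ollama run tdm-8b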

To enable multi-turn conversation, paste this Jinja template as the chat template:

{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='', is_first_sp=true, is_last_user=false) %}{%- for message in messages %}{%- if message['role'] == 'system' %}{%- if ns.is_first_sp %}{% set ns.system_prompt = ns.system_prompt + message['content'] %}{% set ns.is_first_sp = false %}{%- else %}{% set ns.system_prompt = ns.system_prompt + '
' + message['content'] %}{%- endif %}{%- endif %}{%- endfor %}{{ bos_token }}{{ ns.system_prompt }}{%- for message in messages %}{% set content = message['content'] %}{%- if message['role'] == 'user' %}{%- set ns.is_tool = false -%}{%- set ns.is_first = false -%}{%- set ns.is_last_user = true -%}{{'<|User|>' + content + '<|Assistant|>'}}{%- endif %}{%- if message['role'] == 'assistant' %}{%- set content = (message.content.split('</think>')|last).lstrip() %}{%- endif %}{%- if message['role'] == 'assistant' and message['tool_calls'] is defined and message['tool_calls'] is not none %}{%- set ns.is_last_user = false -%}{%- if ns.is_tool %}{{'<|tool▁outputs▁end|>'}}{%- endif %}{%- set ns.is_first = false %}{%- set ns.is_tool = false -%}{%- set ns.is_output_first = true %}{%- for tool in message['tool_calls'] %}{%- if not ns.is_first %}{%- if content is none %}{{'<|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '
' + '```json' + '
' + tool['function']['arguments'] + '
' + '```' + '<|tool▁call▁end|>'}}{%- else %}{{content + '<|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '
' + '```json' + '
' + tool['function']['arguments'] + '
' + '```' + '<|tool▁call▁end|>'}}{%- endif %}{%- set ns.is_first = true -%}{%- else %}{{'
' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '
' + '```json' + '
' + tool['function']['arguments'] + '
' + '```' + '<|tool▁call▁end|>'}}{%- endif %}{%- endfor %}{{'<|tool▁calls▁end|><|end▁of▁sentence|>'}}{%- endif %}{%- if message['role'] == 'assistant' and (message['tool_calls'] is not defined or message['tool_calls'] is none)%}{%- set ns.is_last_user = false -%}{%- if ns.is_tool %}{{'<|tool▁outputs▁end|>' + content + '<|end▁of▁sentence|>'}}{%- set ns.is_tool = false -%}{%- else %}{{content + '<|end▁of▁sentence|>'}}{%- endif %}{%- endif %}{%- if message['role'] == 'tool' %}{%- set ns.is_last_user = false -%}{%- set ns.is_tool = true -%}{%- if ns.is_output_first %}{{'<|tool▁outputs▁begin|><|tool▁output▁begin|>' + content + '<|tool▁output▁end|>'}}{%- set ns.is_output_first = false %}{%- else %}{{'
<|tool▁output▁begin|>' + content + '<|tool▁output▁end|>'}}{%- endif %}{%- endif %}{%- endfor -%}{% if ns.is_tool %}{{'<|tool▁outputs▁end|>'}}{% endif %}{% if add_generation_prompt and not ns.is_last_user and not ns.is_tool %}{{'<|Assistant|>'}}{% endif %}
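
If you serve the model with llama.cpp rather than a GUI client, a hedged way to apply this template is to save it to a file and pass it to llama-server. The --jinja and --chat-template-file flags assume a reasonably recent llama.cpp build; the file name tdm_chat_template.jinja is just an example:

# save the template above to tdm_chat_template.jinja, then:
llama-server --hf-repo barandinho/TDM-8b-v0.1-Q8_0-GGUF --hf-file tdm-8b-v0.1-q8_0.gguf \
  --jinja --chat-template-file tdm_chat_template.jinja -c 2048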

Use with llama.cpp

Below are auto-generated, generic instructions for using the model with llama.cpp.

Install llama.cpp through brew (works on macOS and Linux):

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo barandinho/TDM-8b-v0.1-Q8_0-GGUF --hf-file tdm-8b-v0.1-q8_0.gguf -p "The meaning to life and the universe is"
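
If you prefer to download the GGUF once and run it from disk, something like the following should work, assuming the Hugging Face CLI (huggingface-cli) is installed; the local directory is just an example:

# download the quantized file, then point llama-cli at the local path
huggingface-cli download barandinho/TDM-8b-v0.1-Q8_0-GGUF tdm-8b-v0.1-q8_0.gguf --local-dir .
llama-cli -m ./tdm-8b-v0.1-q8_0.gguf -p "The meaning to life and the universe is"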

Server:

llama-server --hf-repo barandinho/TDM-8b-v0.1-Q8_0-GGUF --hf-file tdm-8b-v0.1-q8_0.gguf -c 2048
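
Once the server is running (it listens on http://localhost:8080 by default), you can send requests to its OpenAI-compatible chat endpoint with the system prompt and the recommended sampling settings. A minimal sketch; note that top_k and min_p are llama-server extensions to the OpenAI schema, and the user message is just an example:

# example request to llama-server's OpenAI-compatible endpoint
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "system", "content": "Sen TÜDÜM (TÜrkçe Düşünen Üretken Model) isimli yardımsever bir yapay zeka modelisin.\nTürkçe cevap ver ve cevabını tamamla."},
    {"role": "user", "content": "Merhaba, kendini kısaca tanıtır mısın?"}
  ],
  "temperature": 0.6,
  "top_p": 0.95,
  "top_k": 20,
  "min_p": 0.01
}'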

Note: You can also use this checkpoint directly by following the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag, along with other hardware-specific flags (e.g. LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make
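
Newer llama.cpp releases have replaced the Makefile build with CMake, so if the make command above fails, a hedged CMake equivalent is shown below (GGML_CUDA has replaced LLAMA_CUDA as the CUDA switch in recent versions):

# CMake build; add -DGGML_CUDA=ON for Nvidia GPUs on Linux
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release -j
# the binaries used in Step 3 then live under build/bin/, e.g. ./build/bin/llama-cli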

Step 3: Run inference through the main binary.

./llama-cli --hf-repo barandinho/TDM-8b-v0.1-Q8_0-GGUF --hf-file tdm-8b-v0.1-q8_0.gguf -p "The meaning to life and the universe is"

or

./llama-server --hf-repo barandinho/TDM-8b-v0.1-Q8_0-GGUF --hf-file tdm-8b-v0.1-q8_0.gguf -c 2048

Model details: GGUF format, 8.19B params, qwen3 architecture, 8-bit (Q8_0) quantization.
