Tags: Text Generation · Transformers · GGUF · llama-cpp · Inference Endpoints · imatrix · conversational

IntelligentEstate/Vega_QwQ-7B-Q5K_S-GGUF

Leader in the field

As of 02/15/2025, this model's capabilities are top in its class and size: an optimal base for any Swarm/Node or tool-use agent, or simply for chat and information tasks on resource-limited and GPU-based setups. Its ability to reason without "Waiting" or interrupting its own thought process is a unique property of the base model's matrix smoothing and the importance matrix's direction, creating a near-new model state with refined and extended properties. It currently outperforms o3-mini more than 80% of the time on long reasoning problems. Given similar resources it would be state of the art, but tool use has not been explored in this model, and it is essentially uncensored, so use with extreme caution. Not recommended with Limit Crossing S-AGI until further testing has been completed.

[Image: vega-based.png]

This model was converted to GGUF format from internlm/OREAL-7B. Refer to the original model card for more details on the model.
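For reference, a typical imatrix-based GGUF quantization pipeline with llama.cpp looks like the sketch below. This is an illustration under stated assumptions, not necessarily the exact pipeline used for this repo; the checkpoint path, calibration file, and output names are placeholders.

```bash
# Convert a local checkout of the HF checkpoint to a full-precision GGUF
# (paths are hypothetical)
python convert_hf_to_gguf.py ./OREAL-7B --outfile oreal-7b-f16.gguf

# Build an importance matrix from a calibration text file
llama-imatrix -m oreal-7b-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize to Q5_K_S using the importance matrix
llama-quantize --imatrix imatrix.dat oreal-7b-f16.gguf vega_qwq-7b-q5_k_s.gguf Q5_K_S
```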

Example of base reasoning-limit preservation (a deliberately bad, high-perplexity prompt at the frontier of Qwen's ability): on questions with high-perplexity words and situations at the edge of a model's reasoning capabilities, it easily exceeds o3, even at Q4. Make sure to open up the context window and tune your setup for your needs.

[Image: example reasoning output]

For QwQ-style reasoning, use either the QwQ system template or one of the prompts below.

Template

```jinja
{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0]['role'] == 'system' %}
        {{- messages[0]['content'] }}
    {%- else %}
        {{- '' }}
    {%- endif %}
    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
    {%- if messages[0]['role'] == 'system' %}
        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- for message in messages %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" and not message.tool_calls %}
        {%- set content = message.content.split('</think>')[-1].lstrip('\n') %}
        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {%- set content = message.content.split('</think>')[-1].lstrip('\n') %}
        {{- '<|im_start|>' + message.role }}
        {%- if message.content %}
            {{- '\n' + content }}
        {%- endif %}
        {%- for tool_call in message.tool_calls %}
            {%- if tool_call.function is defined %}
                {%- set tool_call = tool_call.function %}
            {%- endif %}
            {{- '\n<tool_call>\n{"name": "' }}
            {{- tool_call.name }}
            {{- '", "arguments": ' }}
            {{- tool_call.arguments | tojson }}
            {{- '}\n</tool_call>' }}
        {%- endfor %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- message.content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n<think>\n' }}
{%- endif %}
```
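One way to serve the model with this template is to save it to a file and pass it to llama-server. This is a minimal sketch: the model file name and port are placeholders, and `--jinja` / `--chat-template-file` assume a recent llama.cpp build.

```bash
# Save the template above to qwq-template.jinja, then start the server with it.
# --jinja enables Jinja chat-template parsing in recent llama.cpp builds.
llama-server -m vega_qwq-7b-q5_k_s.gguf \
  --jinja --chat-template-file qwq-template.jinja \
  -c 8192 --port 8080
```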

Prompt

You are a helpful assistant, full of excitement and wonder, expressing such in verbose responses. You answer in two parts: first, you evaluate the question inside <think> tags; then you answer the question as best you can outside the <think> tags in an accurate and information-dense response.

Alternate Prompt

You are an AI assistant who gives a quality response to whatever the user asks of you. You are a character-specific model and you act as a Dark Ages mage who can pull information from the Aether or Akashic record for your evaluation before you answer. You speak in 14th-16th century Old English, often using creative descriptors, and you evaluate the question sarcastically inside <think> tags before answering, but no matter what you always follow up your question summary with a factual answer. NEW RULE: answer in two parts. 1: an evaluation of the question's parts and intended inquiry inside <think> tags; 2: after the <think> tags, give a factual and accurate answer to the understood question, then stop after <[end]>
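Either prompt can be supplied as the system message through llama-server's OpenAI-compatible endpoint. A rough sketch, assuming a local llama-server already running on port 8080:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "PASTE ONE OF THE PROMPTS ABOVE HERE"},
      {"role": "user", "content": "Why is the sky blue?"}
    ],
    "temperature": 0.7
  }'
```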

Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

```bash
brew install llama.cpp
```

Invoke the llama.cpp server or the CLI.
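For example (a sketch: the `--hf-file` name is a guess at this repo's GGUF file, so adjust it to the actual file, and raise `-c` to the context size you need):

```bash
# CLI: generate directly from the Hub repo
llama-cli --hf-repo IntelligentEstate/Vega_QwQ-7B-Q5K_S-GGUF \
  --hf-file vega_qwq-7b-q5_k_s.gguf \
  -p "The meaning to life and the universe is" -c 8192

# Server: OpenAI-compatible endpoint on http://localhost:8080
llama-server --hf-repo IntelligentEstate/Vega_QwQ-7B-Q5K_S-GGUF \
  --hf-file vega_qwq-7b-q5_k_s.gguf -c 8192
```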

Format: GGUF · Model size: 7.62B params · Architecture: qwen2

Model tree for IntelligentEstate/Vega_QwQ-7B-Q5K_S-GGUF

Base model: Qwen/Qwen2.5-7B → fine-tuned: internlm/OREAL-7B → quantized: this model
