Qwen2.5-Wolverine-Coder-11B

This repo contains the full precision source code, in "safe tensors" format to generate GGUFs, GPTQ, EXL2, AWQ, HQQ and other formats. The source code can also be used directly.
"Ripping your programming worries to shreds... fast."
Tipping the scales at 42 layers and 507 tensors... the monster lives.
Two monsters in fact.
This repo has the source code for Version 1.
Each model generates stronger, more compact code with an enhanced understanding of your instructions and follows what you tell them to the letter.
These overpowered - yet wickedly fast - CODING ENGINES are based on two of the best coder AIs:
"Qwen2.5-Coder-7B-Instruct"
[ https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct ]
and
"OlympicCoder-7B"
[ https://huggingface.co/open-r1/OlympicCoder-7B ]
These two models are stuffed into one compact powerhouse 11B merge that is stronger in performance and understanding than both donor models.
Quants Q4_K_S - one of each version - are available at the moment.
AND
Quants IQ3_S - NEO IMATRIX (in house dataset developed by DavidAU), one for each version too.
AND
Q8 MAX - one for each model.
These are unaltered quants for primary testing, except Q8 MAX with output tensor at bfloat16 (full precision).
The output tensor accounts for 10-20% of the "decision making" in a model.
Suggest smaller quants for simpler projects / smaller code, with Q8 MAX for larger more complex projects // lots of instructions.
LIMITED GGUF REPO HERE:
https://huggingface.co/DavidAU/Qwen2.5-Wolverine-CODER-11B-gguf
(additional quants/ggufs under "quantizations" upper right)
CONFIGS:
- #1 -> OlympicCoder-7B as primary/start, with Qwen2.5-Coder-7B-Instruct as "finalizer".
- #2 -> Qwen2.5-Coder-7B-Instruct primary/start, with OlympicCoder-7B as "finalizer".
NOTES:
- Each config/version will be very different from each other.
- Model tested to IQ2_S (NEO Imatrix), fully operational.
- Tool Calling is supported in both versions.
- Source(s) / full quanting to follow // full repos to follow.
- Final model size (including layers/tensors) / config subject to change.
Config / Settings
Model is set at 32k/32768 context for these GGUFS, full quants/full repos will be 128k/131072.
Requirements [Qwen 2.5 7B Coder default settings]:
- Temp .5 to .7 (or lower)
- topk: 20, topp: .8, minp: .05
- rep pen: 1.1 (can be lower)
- Jinja Template (embedded) or CHATML template.
- A System Prompt is not required. (ran tests with blank system prompt)
Refer to either "Qwen2.5-Coder-7B-Instruct" and/or "OlympicCoder-7B" repos (above) for additional settings, benchmarks and usage.
Help, Adjustments, Samplers, Parameters and More
Settings: CHAT / ROLEPLAY and/or SMOOTHER operation of this model:
In "KoboldCpp" or "oobabooga/text-generation-webui" or "Silly Tavern" ;
Set the "Smoothing_factor" to 1.5
: in KoboldCpp -> Settings->Samplers->Advanced-> "Smooth_F"
: in text-generation-webui -> parameters -> lower right.
: In Silly Tavern this is called: "Smoothing"
NOTE: For "text-generation-webui"
-> if using GGUFs you need to use "llama_HF" (which involves downloading some config files from the SOURCE version of this model)
Source versions (and config files) of my models are here:
OTHER OPTIONS:
Increase rep pen to 1.1 to 1.15 (you don't need to do this if you use "smoothing_factor")
If the interface/program you are using to run AI MODELS supports "Quadratic Sampling" ("smoothing") just make the adjustment as noted.
Highest Quality Settings / Optimal Operation Guide / Parameters and Samplers
This a "Class 1" model:
For all settings used for this model (including specifics for its "class"), including example generation(s) and for advanced settings guide (which many times addresses any model issue(s)), including methods to improve model performance for all use case(s) as well as chat, roleplay and other use case(s) please see:
You can see all parameters used for generation, in addition to advanced parameters and samplers to get the most out of this model here:
- Downloads last month
- -