General overview.

#1
by VizorZ0042 - opened

Dear DavidAU.

I have been testing this model for 7+ hours and am astonished by its capabilities: it doesn't break characters, keeps a good writing style, and is greatly improved overall. It's astonishing how an 8B model can be so good. Thank you for your amazing work; I truly appreciate every bit you do for everyone.

Owner

Thank you again.
I was actually surprised myself during testing at how well this model performs.
Going to test the 2M/4M versions next.

Seems some of the 1m "training" transfers to the "core" model - still a lot of questions/things to try.

Yeah, the 1M dramatically enhances your models. According to the comparison list, they don't differ much, but I'm curious to see and test them further.

Further testing also reveals an extra EOS token: "://"

Owner

Excellent; do you remember the extra EOS token that it... popped out?
Seems there was an update to the tokenizer for the 1 million model after this model was created; that might address this.
Hmmm.

@DavidAU It's related to Deep Hermes, as Dark-Reasoning-Dark-Planet-Hermes-R1-Uncensored outputs the same. Sometimes it just outputs "://" (without quotes) before triggering an actual EOS token (in my case it's >), or something like:
You are a smart, helpful assistant...
etc. before triggering an actual EOS token.

@DavidAU After 2+ weeks of testing I noticed how extremely stable this model is (up to temp 2.3); it doesn't mess up memory and names the way Stheno merges do, and it is definitely one of the most stable 8B models so far. Temp 2.4 and 2.5 may sometimes give good results, and above temp 2.5 the output shows more and more noticeable confusion.

Lower temps (1.35-1.6 / 1.75) give a good balance and greater stability for longer instructions and character cards.

Overall, this model has great performance and outstanding stability and coherence, while providing good creativity.
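
For reference, a minimal sketch (not from the thread) of trying these sampler values locally with llama-cpp-python; it assumes a GGUF quant of the model has been downloaded, and the file name and prompt are placeholders:

```python
# Minimal sketch, assuming llama-cpp-python and a locally downloaded GGUF quant.
# The model file name below is a placeholder, not an official release name.
from llama_cpp import Llama

llm = Llama(
    model_path="dark-planet-hermes-8b.Q4_K_M.gguf",  # placeholder quant file
    n_ctx=12288,       # ~12K context, the size tested in this thread for Q4 quants
    n_gpu_layers=-1,   # offload all layers if VRAM allows
)

# Temps reported above as stable: up to ~2.3, with 1.35-1.75 as the balanced band.
out = llm(
    "Write the opening scene of a slow-burn character study.",
    max_tokens=512,
    temperature=1.5,   # inside the 1.35-1.75 range suggested for long instructions
    top_k=40,
    repeat_penalty=1.1,
)
print(out["choices"][0]["text"])
```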

Owner

@VizorZ0042

Excellent, thank you for the detailed notes.

Just uploading source for multiple context levels of Qwen 3 - 8Bs.
("reg" 32k context is already up/quanted => NEO/HORROR versions)

Found that setting / changing the "core max context" length (via YARN) impacts generation - especially long form / creative.
Uploading 64k, 96k, 128k, 192k, 256k and 320k versions.

Likewise, the HORROR / NEO imatrix, when applied to each (and generated at different max context lengths), also affects gen / operation / reasoning.
For creative -> This can have an extreme impact.
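
For anyone loading one of these extended-context GGUFs, a minimal sketch with llama-cpp-python; the file name is a placeholder, and since the YARN scaling is baked into each GGUF's metadata, usually only the requested context size needs to change:

```python
# Minimal sketch, assuming llama-cpp-python. The file name is a placeholder for
# one of the extended-context GGUFs; YARN scaling is already in its metadata.
from llama_cpp import Llama

llm_128k = Llama(
    model_path="qwen3-8b-128k.Q4_K_S.gguf",  # placeholder file name
    n_ctx=131072,       # ask for the full 128K window the GGUF was built for
    n_gpu_layers=-1,
)

# If a build requires setting the scaling manually, recent llama-cpp-python
# versions expose YARN options on the constructor (verify against your version):
#   rope_scaling_type=llama_cpp.LLAMA_ROPE_SCALING_TYPE_YARN,
#   yarn_orig_ctx=32768,  # the "reg" 32K core context mentioned above
```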

@DavidAU Absolutely great! But unfortunately I can only test a max of 12K context for Q4KS / Q4KM, or 10K for Q5KS / Q5KM, with my machine.

If it's okay to test your Qwen 3 8B variants with 10K/12K context, I'll be glad to do it.

Owner

@VizorZ0042
IQ3_M (imatrix) works very well (as do all IQ3s); likewise IQ4XS/NL.
More context, and more headroom in your VRAM.
I tested IQ3_M (imat) with the 320k context version.

NOTE:
Without Imatrix, min size is IQ4XS / Q4 or better.

@DavidAU
After a long time and countless tests I noticed this particular LLM starts to repeat certain segments over and over; increasing Repetition Penalty, Temperature, or Top_K does not help. However, increasing Repetition Range from 64 to 128 helps, though it alters the verbosity in a slightly worse way, most noticeable in the flow / progression of events.

This behavior is very similar to 1M-DarkPlanet, for which you made version 1.01 to fix a similar issue.
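
For reproducing the Repetition Range change, a hedged sketch in llama-cpp-python; mapping "Repetition Range" to last_n_tokens_size is my assumption (front-ends name this setting differently), and the file name is a placeholder:

```python
# Hedged sketch, assuming llama-cpp-python. "Repetition Range" is taken here to
# mean the window of recent tokens the repeat penalty considers; in
# llama-cpp-python that window is last_n_tokens_size (default 64).
from llama_cpp import Llama

llm = Llama(
    model_path="model.Q4_K_M.gguf",   # placeholder quant file
    n_ctx=12288,
    last_n_tokens_size=128,           # widen the penalty window from 64 to 128
)

out = llm(
    "Continue the scene without repeating earlier descriptions.",
    max_tokens=512,
    temperature=1.5,
    repeat_penalty=1.1,               # kept moderate; raising it reportedly didn't help
)
print(out["choices"][0]["text"])
```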

It also sometimes pastes Python scripts, starting with ``` or, in very rare cases, :// (both from NousHermes's side, not UltraLong-1M).

In conclusion, after countless tests I must say this model is the most stable compared to most of the 8B models you have (not including non-merged thinking models).

Excellent; thank you for the feedback and detailed notes.

Testing with custom settings:

Kyubey (without Optional Enhancement):

Custom_NearPerfect_000-000.jpg
Custom_NearPerfect_000-001.jpg

Kyubey (with Optional Enhancement):

Custom_NearPerfect_000.1-000.jpg
Custom_NearPerfect_000.1-001.jpg

More stable compared to 1M-DarkPlanet; needs to be fine-tuned; performs much better with custom settings compared to the default CLASS1.
