Excellent Model / Detailed overview and discussion.

#1
by VizorZ0042 - opened

Dear DavidAU.

I've been testing this model alongside the 128k one, and I can say this one is much more bulletproof: its output is visibly more coherent, stable, vivid, and logical, and it's really resilient. I've also noticed how much better it follows long instructions (especially with long character cards) compared to the base one.

Overall it's nearly perfect, and somehow doesn't have the same issues involving the Stheno merge (currently). I've also noticed increased repetition in some cases compared to the base one, so increasing the repetition penalty to 1.105 or 1.121 helps; it also retains good coherence even with a higher repetition penalty compared to the other models.

Excellent; thank you ;

RE: Repeat;
Hmm, maybe higher temps? And/or a larger, slightly more detailed prompt?
I find this helps with this type of issue.

A larger rep pen range (even with a lower rep pen) may also do the trick.
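For context on the temperature suggestion: temperature rescales the logits before sampling, and a higher temperature flattens the distribution, which makes repetitive loops less likely to lock in. A minimal plain-Python sketch (no particular inference library assumed):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw logits to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                   # illustrative values, not from any model
p_low = softmax_with_temperature(logits, temperature=0.7)
p_high = softmax_with_temperature(logits, temperature=1.5)
# At the higher temperature, the top token's probability shrinks,
# so less likely (more varied) continuations get sampled more often.
```

The point is only the shape of the effect: raising the temperature redistributes probability away from the single most likely token, which is often enough to escape a repeat loop.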

Well, I experimented with a really detailed prompt; it starts repeating midway through completing the prompt, looping by adding the same sentence in slightly altered forms, over and over.

A higher rep pen range helps with repetition, but at the cost of shorter responses, slightly less vivid and descriptive scenes, and faster completion of the whole prompt. For example, rep pen range 64 fulfills all the actions from the detailed prompt in 32 outputs, assisted by user inputs, while rep pen range 256 might finish it in 27 outputs, and so on. Smoothing Factor helps a bit, but may also decrease coherency.
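To make the rep-pen-range trade-off concrete, here is a sketch of how a llama.cpp-style repetition penalty is commonly applied: only tokens that appeared in the last `rep_range` generated tokens are penalized, so a larger range suppresses more of the recent vocabulary, which breaks loops but also discourages legitimate reuse of descriptive words and tends to shorten responses. The function name and values are illustrative, not any model's actual implementation:

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1, rep_range=64):
    """Penalize tokens seen in the last `rep_range` generated tokens.
    Follows the common llama.cpp convention: positive logits are divided
    by the penalty, negative logits are multiplied by it, so either way
    the repeated token becomes less likely."""
    window = set(recent_tokens[-rep_range:])
    out = list(logits)
    for tok in window:
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

logits = [1.5, -0.5, 0.8, 0.2]      # toy logits for a 4-token vocabulary
history = [0, 2, 2, 3]              # token ids generated so far
penalized = apply_repeat_penalty(logits, history, penalty=1.105, rep_range=64)
# Tokens 0, 2, and 3 are now less likely; token 1 is untouched.
```

With a small `rep_range`, a word from 200 tokens ago is free to reappear; with a large one it is still penalized, which is exactly the "shorter, less descriptive" effect described above.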

Furthermore, I'm still testing it and am astonished by how well it performs and how stable it is.

@VizorZ0042
Thank you so much for your feedback and testing.

I have located a possible fix for the "repeat at the end" issues, and it is in testing.
Once the hard testing is complete, I will upload the new quants.

@DavidAU I do hope the new quants won't be less coherent and/or creative, as currently this model has outstanding coherency even with a higher Repetition Penalty, Temperature, Top-K, and other sampling configurations aimed at better creativity, where other models lose coherency with even small changes to Top-K, a higher Temperature, etc.

@VizorZ0042

Going to upload them as a new version alongside the current version in the repo.
Same thoughts as you here - a little fix sometimes does not serve the "greater good".

The issue relates to EOS tokens; a patch, shipped via new quants, will address this.
This issue appears in Grand Horror 1 million as well as Darkest Planet 1 million, but not in DeepHermes Reasoning 1 million.
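If the repeat-at-the-end behaviour comes from the model failing to emit its EOS token, one common stopgap on the sampler side (independent of whatever the quant-level patch does; this is not the actual fix) is to bias the EOS logit upward late in a generation. A hypothetical sketch; `eos_id`, `soft_cap`, and the bias schedule are all assumptions for illustration:

```python
def bias_eos(logits, eos_id, tokens_generated, soft_cap=512, bias_per_token=0.02):
    """Gently raise the EOS logit once generation passes `soft_cap` tokens,
    nudging the model to stop instead of looping its final sentence."""
    out = list(logits)
    overshoot = max(0, tokens_generated - soft_cap)
    out[eos_id] += overshoot * bias_per_token
    return out

logits = [0.3, 0.1, -1.2]     # pretend token id 2 is EOS
early = bias_eos(logits, eos_id=2, tokens_generated=100)   # before the cap: no change
late = bias_eos(logits, eos_id=2, tokens_generated=600)    # past the cap: EOS boosted
```

The gentle ramp matters: a hard EOS boost would truncate responses abruptly, while a gradual one mostly affects generations that have already run long, which is where the looping shows up.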

Thank you again for your reviews/feedback and help.

@VizorZ0042
Version 1.01 quants uploading now for Dark Planet 1 million context.
Filename will have "1.01" in it.

@DavidAU Thanks once again for your great work; I was really glad to help. I'll be testing this one too, with Q4KS quants and a special set of actions and prompts across around 42 generations, and I'll show the results later.
