Great model

#1
by JasonNan - opened

This is a very good model. I have a question for my own understanding. When using this model, which part of the output comes from which part of the model? Are they separated? Does it mean DeepHermes-3-Llama-3-8B-Preview contributes only to the tag parts while the actual final result comes only from Dark Planet 8B? Or is it a more complex blend? (Excuse my ignorance, I'm not an expert in LLMs.)

Owner

thank you! ;

This model uses a mixing method at the final layers, rather than the "reasoning model" totally controlling the output at that point.
This lets more of the core model "shine through" in the output generation step(s), and in some cases in the thinking too.
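For anyone curious what "mixing at the final layers" can look like mechanically, here's a rough sketch of a layer-weighted linear merge between the two source checkpoints. The repo ids, blend ratios, and ramp schedule below are illustrative assumptions only, not the exact recipe used for this model:

```python
# Illustrative sketch of a layer-weighted linear merge between two Llama-3 8B
# fine-tunes. Repo ids and the weighting schedule are assumptions for the example.
import re
import torch
from transformers import AutoModelForCausalLM

REASONER = "NousResearch/DeepHermes-3-Llama-3-8B-Preview"  # reasoning donor (example id)
CORE = "DavidAU/L3-Dark-Planet-8B"                         # core writing model (example id)

reasoner = AutoModelForCausalLM.from_pretrained(REASONER, torch_dtype=torch.bfloat16)
core = AutoModelForCausalLM.from_pretrained(CORE, torch_dtype=torch.bfloat16)

num_layers = core.config.num_hidden_layers
layer_re = re.compile(r"model\.layers\.(\d+)\.")

def core_fraction(name: str) -> float:
    """Fraction of the core model kept for this tensor (hypothetical schedule):
    mostly reasoner in the early layers, mostly core model in the final layers."""
    m = layer_re.search(name)
    if m is None:                      # embeddings, final norm, lm_head
        return 0.5
    frac = int(m.group(1)) / (num_layers - 1)
    return 0.2 + 0.7 * frac            # ramps from 0.2 up to 0.9 at the last layer

# Blend the two state dicts tensor by tensor.
merged = core.state_dict()
reasoner_sd = reasoner.state_dict()
for name, tensor in merged.items():
    w = core_fraction(name)
    merged[name] = w * tensor + (1.0 - w) * reasoner_sd[name]

core.load_state_dict(merged)
core.save_pretrained("merged-dark-planet-hermes-sketch")
```

In practice a tool like mergekit handles this kind of per-layer weighting, but the idea is the same: the reasoner dominates the early and mid layers, while the core model gets progressively more say toward the output end.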

An update: I've used this model quite a bit now. It's very creative, let's put it this way. But now I'm switching to another model you have (Llama-3.1-Dark-Planet-SuperNova-8B-D_AU). The reason is that Llama-3.1-Dark-Planet-SuperNova-8B-D_AU follows instructions quite well, while L3.1-Evil-Reasoning-Dark-Planet-Hermes-R1-Uncensored-8B is rather 'disobedient': more often than not, he decides to go his own way in answering and doesn't care much about the precise instructions, even at temps as low as 0.5. He's great for creativity, but when you need some structure, Dark-Planet-SuperNova seems to do the job better. My 2 cents. Thanks for all your models, they're the best out there!

Excellent; thank you for the feedback and detailed notes.

@JasonNan

Try SpinFire too; it's more stable than SuperNova, while Gemma3-12B is the best for creativity and stability (compared to other 12B models).
