Wondering

#1
by Elfrino - opened

Hey Dave,

Just wondering if Psymed RP is still on the cards? Still love that model for its creativity. I've been checking your Hugging Face page daily hoping it'll pop up soon, haha. It's okay if you're busy; I've noticed a lot of juicy new models you're pushing out constantly. Really keen to try the Command R remaster. :)

P.S. I'm noticing Undi's 20B frankenmerges (not just Psymed) are super solid; that guy has some secret sauce in his models. If they were longer context they would be perfect, especially PsymedRP, MXLed-L2-20B and mlewd-remm-l2-chat-20b-inverted.

Thanks for the great work. :)

Owner

It is, and it's on the list.
Yes, a long list of 20B models / upgrades to them are planned as time permits (err... upload gods willing, of course).
I may use his 13B version and merge it into a 29B-ish version, as those extend the power of the models significantly.
Although I do have his 20B PsyMed mapped out for "mergekit".

Context is a training issue: models need to be trained on longer context --> then you can use it.
However, you can also use "rope settings" to extend 4k up to 16k or beyond.
The drawback is that instruction following / output can suffer.
For info on this, see my "COMPOS" 4-model merge; ROPE info is detailed there and can be used for any model... llama3, llama2, mistral, etc.
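For intuition, here is a minimal sketch of what linear rope scaling does, assuming the usual RoPE formulation; the dimension, base, and scale values are illustrative, not tied to any particular model:

```python
import math  # only used if you extend this to actual sin/cos rotations

def rope_angles(position, dim=8, base=10000.0, scale=1.0):
    """Return the rotation angles RoPE applies at `position`.

    `scale` > 1 implements linear position interpolation:
    position 16000 with scale=4.0 sees the same angles that
    position 4000 did during training, squeezing a 16k context
    into the trained 4k range.
    """
    # Standard RoPE inverse frequencies, one per pair of dimensions.
    inv_freq = [base ** (-2 * i / dim) for i in range(dim // 2)]
    return [(position / scale) * f for f in inv_freq]

# With scale=4, position 16000 lands exactly where 4000 was trained:
assert rope_angles(16000, scale=4.0) == rope_angles(4000, scale=1.0)
```

This is why output quality can suffer: the model sees positions "compressed" relative to what it was trained on.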
Thanks;
D

Oh that's great to hear. :)

I tried merging PsymedRP-20B with DarkForest-20B using mergekit, but unfortunately it came out a garbled mess. :/ I must say I'm rather new to the whole merging thing, and I'm sure I screwed something up somewhere, lol.

Very interesting idea, merging the 20B Psymed with the 13B version. I might have to try that, but I need to understand merging better first; I know there are all different types of merges available, and I need to read up on it more before experimenting again.

Yes, I've used rope with KoboldCPP and it's as easy as extending the context in the start UI; it does the rope scaling under the hood (at least that's what I gleaned from the documentation). However, you're right: the output, while still good, is slightly blunted and dulled, unfortunately.

Anyways, thanks for the tips and advice. :)

Owner

RE: PsymedRP-20B with DarkForest-20B
Try a "task arithmetic" or "dare" merge.
You could also try a "model stock" merge. You might also want to check out "mergekit" merge formulas as well:

https://huggingface.co/models?other=mergekit&sort=created

Most of the model makers list the formula.
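For intuition, here's a toy sketch of the drop-and-rescale idea behind a "dare" merge, using plain Python lists instead of real tensors; the drop probability follows the usual DARE recipe, but everything else (function name, averaging across fine-tunes) is just illustrative:

```python
import random

def dare_merge(base, finetunes, drop_p=0.9, seed=0):
    """Toy DARE merge over flat lists of parameters.

    For each fine-tune: take its delta from the base, randomly drop a
    fraction `drop_p` of the delta entries, rescale the survivors by
    1/(1 - drop_p) so the expected delta is preserved, then add the
    (averaged) sparse deltas back onto the base weights.
    """
    rng = random.Random(seed)
    merged = list(base)
    for ft in finetunes:
        for i, (b, f) in enumerate(zip(base, ft)):
            delta = f - b
            if rng.random() >= drop_p:  # keep with probability 1 - drop_p
                merged[i] += (delta / (1.0 - drop_p)) / len(finetunes)
    return merged

# If a fine-tune equals the base, every delta is zero and the merge
# is a no-op, no matter what gets dropped:
base = [1.0, 2.0, 3.0, 4.0]
assert dare_merge(base, [base]) == base
```

Dropping most of each delta is what lets several fine-tunes be combined without their parameter changes clobbering each other, which is often what a plain average gets wrong.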
D
