Fizz 🏳️‍⚧️

Fizzarolli

AI & ML interests

None yet

Recent Activity

updated a model about 1 hour ago: allura-forge/q3-30b-rc3-kto-adpt-step50
published a model about 1 hour ago: allura-forge/q3-30b-rc3-kto-adpt-step50
updated a model about 12 hours ago: allura-forge/q3-30b-rc3-actually-good-now-i-promise

Organizations

Zeus Labs, Social Post Explorers, ShuttleAI, testing, Alfitaria, Allura, Estrogen, Smol Community, Allura (Forge), Allura (Quants), Mawdistical Brewery

Posts (2)

Post
hi everyone!

i wanted to share an experiment i did recently: upcycling phi-3 mini into an MoE (there's a rough sketch of the recipe at the end of this post).
the benchmarks are within the margin of error, meaning it performs about on par with the dense model for now, but i think it's an interesting base for trying to push phi's performance further! (maybe continued training on HuggingFaceFW/fineweb-edu could be interesting; see the quick loader sketch below. i also left some other notes if anyone with access to more compute wants to try it themselves)
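a minimal sketch (not from the original post) of streaming a small slice of HuggingFaceFW/fineweb-edu with the `datasets` library; the "sample-10BT" config and the "text" field come from the dataset card:

```python
# stream a small sample of fineweb-edu without downloading the full dump.
from datasets import load_dataset

ds = load_dataset(
    "HuggingFaceFW/fineweb-edu",
    name="sample-10BT",  # ~10B-token sample config from the dataset card
    split="train",
    streaming=True,      # iterate lazily instead of materializing on disk
)
print(next(iter(ds))["text"][:300])  # peek at the first document
```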

check it out! Fizzarolli/phi3-4x4b-v1
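for anyone curious what the upcycling step itself can look like: below is a minimal, hypothetical sketch of the standard dense-to-MoE recipe (copy the pretrained MLP into each expert, add a freshly initialized router). the class names and shapes are illustrative, not taken from the actual phi3-4x4b-v1 code.

```python
# sketch of "sparse upcycling": turn one pretrained dense MLP into N experts.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEBlock(nn.Module):
    """Replaces a dense MLP with N identical copies of it plus a fresh router."""

    def __init__(self, dense_mlp: nn.Module, hidden_size: int,
                 num_experts: int = 4, top_k: int = 2):
        super().__init__()
        # each expert starts as an exact copy of the pretrained dense MLP,
        # so the upcycled model begins close to the dense model's behavior.
        self.experts = nn.ModuleList(
            copy.deepcopy(dense_mlp) for _ in range(num_experts)
        )
        # the router is the only newly initialized component.
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, hidden)
        logits = self.router(x)                          # (B, S, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # route each token to top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                  # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out
```

in practice you'd wrap each transformer layer's MLP (e.g. `layer.mlp = MoEBlock(layer.mlp, hidden_size=3072, num_experts=4)` for phi-3 mini's hidden size) and then continue training so the router and experts actually differentiate.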
Post
Is anyone looking into some sort of decentralized/federated dataset generation or classification done by humans instead of synthetically?

From my experience trying models, a *lot* of modern finetunes are trained on what amounts to GPT-4-generated slop, which makes everything sound like a knock-off GPT-4 (see e.g. the Dolphin finetunes). I have a feeling this is a big part of why community finetunes haven't been quite as successful as Meta's own instruct tunes of Llama 3.