AbstractPhil 
posted an update 14 days ago
The T5-small + VIT-L-14 guidance shunt adapter is ready for toy use.
AbstractPhil/t5-vit-14-v1
Included is a simple drop-in for SDXL experimentation using Colab.
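For a rough idea of what a "guidance shunt" between the two encoders could look like, here is a minimal sketch: a small bottlenecked projection that maps T5-small encoder states (d_model = 512) into CLIP-L's text-embedding space (d = 768). The class name, dims, and bottleneck width are assumptions for illustration, not the actual architecture of AbstractPhil/t5-vit-14-v1.

```python
# Hypothetical shunt adapter: project T5-small hidden states into CLIP-L's
# embedding space so the T5 output can steer an SDXL-style pipeline.
# Dims and structure are assumptions, not the released adapter's schema.
import torch
import torch.nn as nn

class ShuntAdapter(nn.Module):
    def __init__(self, t5_dim: int = 512, clip_dim: int = 768, bottleneck: int = 256):
        super().__init__()
        # a narrow bottleneck keeps the parameter count tiny and
        # discourages the adapter from overfitting to the finetune set
        self.down = nn.Linear(t5_dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, clip_dim)
        self.norm = nn.LayerNorm(clip_dim)

    def forward(self, t5_states: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len, t5_dim) -> (batch, seq_len, clip_dim)
        return self.norm(self.up(self.act(self.down(t5_states))))

adapter = ShuntAdapter()
out = adapter(torch.randn(1, 77, 512))  # 77 = CLIP's usual token length
```

A module this small trains in seconds per step, which is what makes the runtime-finetuning idea below plausible.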

The outcome is okay but not great. Diffusers is a headache, so I spent more time trying to untangle that machinery than I did actually messing with this adapter.

I trained two variations of the baseline adapter:
t5-small vanilla and t5-small-human-associated-try2-pass3.
The vanilla variant was more accurate at adding context, while the human-associated one stays locked onto human topics like a bloodhound... badly. Both ended up substandard, even with a robust adapter like this.

Thanks to the T5-small's tiny size, CLIP-L's inference speed, and the adapter's small footprint, finetunes with specific goals can complete at runtime if desired. The adapter has safeguards against overfitting that can be disabled, so runtime freezing and adaptive shifts are a viable methodology for immediate task-pipeline adaptation.
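The "runtime freezing" idea can be sketched as nothing more than toggling gradients on the adapter mid-pipeline: unfreeze, take a few task-specific steps, then freeze for inference. The helper below is a hypothetical stand-in, not the released code, and the adapter here is just a placeholder module.

```python
# Hedged sketch of runtime freezing: flip requires_grad on a tiny adapter
# so it can be finetuned on the fly and then locked for inference.
# The Sequential below is a placeholder, not the actual adapter.
import torch.nn as nn

adapter = nn.Sequential(nn.Linear(512, 256), nn.GELU(), nn.Linear(256, 768))

def set_trainable(module: nn.Module, trainable: bool) -> None:
    # freeze or unfreeze every parameter of the module in one call
    for p in module.parameters():
        p.requires_grad_(trainable)

set_trainable(adapter, True)   # brief task-specific finetune...
# ... a handful of optimizer steps would go here ...
set_trainable(adapter, False)  # ...then freeze for inference
```

Because only the adapter's few hundred thousand parameters ever train, this switch costs essentially nothing compared to touching the T5 or CLIP weights.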

The t5-small lacks the behavioral complexity of a model better built for such a task, such as the base, large, or XXL variants, or even the Flan T5-small. However, that doesn't slow the little brain slug down. It guides, and its wrappers have plenty of rapid-generation potential, whether it's trained the way I trained it or not.
The proof of concept is there, and the outcomes are present. Judge for yourself.
The next variation will have more dims, more catches, higher conv, and additional safeguards against overfitting, as well as considerably more LAION flavors so the T5-flan-base doesn't overwhelm them, or vice versa.

An upgraded version based on t5-flan-base is cooking, with a more powerful schema, more powerful block catches, a more curated bottleneck integrity, additional loss calculations, and roughly 6.7M params.
