Update model_max_length to provide a valuable signal of whether this is the Schnell or Dev model

#25

Currently we have to check the model name or its guidance embedding configuration, but both of these can be changed by continued fine-tuning. The sequence length cannot be changed through fine-tuning; it requires continued pretraining and corrected attn_mask handling during SDPA.
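As a rough illustration of how downstream code could use this signal, here is a minimal sketch that reads `model_max_length` from a parsed tokenizer config. The helper name `guess_flux_variant` and the specific lengths (256 for Schnell, 512 for Dev) are assumptions for the example and are not stated in this PR; adjust them to match the published configs.

```python
import json

# Assumed values for illustration only; this PR does not state them.
SCHNELL_MAX_LENGTH = 256
DEV_MAX_LENGTH = 512


def guess_flux_variant(tokenizer_config: dict) -> str:
    """Return 'schnell', 'dev', or 'unknown' based on model_max_length.

    `tokenizer_config` is the parsed contents of a tokenizer_config.json file.
    This is a hypothetical helper, not part of any library.
    """
    max_len = tokenizer_config.get("model_max_length")
    if max_len == SCHNELL_MAX_LENGTH:
        return "schnell"
    if max_len == DEV_MAX_LENGTH:
        return "dev"
    # Missing or unexpected value: do not guess.
    return "unknown"


# Example with an inline config; in practice this would be read from disk.
example_config = json.loads('{"model_max_length": 256}')
variant = guess_flux_variant(example_config)
```

Unlike the model name or the guidance embedding setting, this check relies on a value that fine-tuning alone would not change, which is the point of the request.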

This is a humble request that should improve the utility of Schnell and reduce the work needed for downstream adaptations.

Ready to merge
This branch is ready to get merged automatically.
