Input tokens limited

#43
by dunzic - opened

The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens:

This happens when your prompt is too long for CLIP. Flux has a second text encoder, T5, that can handle up to 512 tokens (only 256 on Schnell). You have to pass this explicitly during the generation call with max_sequence_length=512.
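For reference, here is a minimal sketch of a full load-and-generate call, assuming the standard diffusers FluxPipeline (the model ID, dtype, and step/guidance values are illustrative, not prescriptive):

    import torch
    from diffusers import FluxPipeline

    # Loading the Dev checkpoint pulls in both text encoders:
    # CLIP (hard 77-token cap) and T5 (up to 512 tokens on Dev).
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev",
        torch_dtype=torch.bfloat16,
    ).to("cuda")

    image = pipe(
        prompt="a very long, detailed prompt ...",
        num_inference_steps=28,
        guidance_scale=3.5,
        max_sequence_length=512,  # T5 limit on Dev; use 256 for Schnell
    ).images[0]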

Hi, thanks. I'm trying to force my FLUX Dev Colab (which uses CLIP by default) to use T5. I added max_sequence_length to my pipe call, but the Colab keeps reporting CLIP truncation ("The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens"), even with:

    image = pipe(
        prompt=processed_caption,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        width=width1 if i == 0 else width2,
        height=height1 if i == 0 else height2,
        generator=generator,
        max_sequence_length=512
    ).images[0]

@QES Both the CLIP and T5 embeddings are passed to the model; T5 just supports a longer sequence length. Without seeing your pipeline load statement, I can't say for sure that T5 is being loaded, but it likely is. The warning will still show up, but generation will work fine and will include the additional T5 tokens past 77. Don't try to disable CLIP; it likely won't work well.
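One quick way to check what got loaded, a sketch assuming the usual diffusers FluxPipeline component names (text_encoder for CLIP, text_encoder_2 for T5):

    # If T5 was loaded, text_encoder_2 is a T5EncoderModel; the
    # pipeline keeps CLIP as text_encoder alongside it regardless.
    print(type(pipe.text_encoder).__name__)    # CLIPTextModel
    print(type(pipe.text_encoder_2).__name__)  # T5EncoderModel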

Thanks. I ran some tests and confirmed it: the 77-token message keeps showing, BUT the whole prompt is processed (I put precise details at the end of a long-ass prompt ;-)
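For anyone who wants to verify this without eyeballing the output, a minimal sketch, assuming the pipeline exposes the usual tokenizer attributes (tokenizer for CLIP, tokenizer_2 for T5):

    # Raw token counts per encoder for the same prompt. The pipeline
    # truncates the CLIP input to 77 tokens during encoding, while the
    # T5 input is kept up to max_sequence_length.
    clip_ids = pipe.tokenizer(processed_caption).input_ids
    t5_ids = pipe.tokenizer_2(processed_caption).input_ids
    print(f"CLIP token count: {len(clip_ids)} (encoded with a 77-token cap)")
    print(f"T5 token count:   {len(t5_ids)} (kept up to max_sequence_length)")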
