Unable to run the example

#17
by tommykoctur - opened

I tried the example code from the model card and it fails with:

```
    760 else:
    761     full_cache_length = attention_mask.shape[-1] if attention_mask is not None else sequence_length
--> 763     cond1 = first_cache_position >= attention_chunk_size
    764     cond2 = (first_cache_position < attention_chunk_size) & (
    765         first_cache_position + sequence_length > attention_chunk_size
    766     )
    767     key_length = (
    768         torch.where(
    769             cond1,
(...)
    774         else full_cache_length
    775     )

TypeError: '>=' not supported between instances of 'Tensor' and 'NoneType'
```
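Not a fix, but for anyone debugging: the traceback boils down to `attention_chunk_size` being `None` at the point where the modeling code compares it against the cache position. A minimal plain-Python sketch of that failure mode (the names are taken from the traceback above; `compute_cond1_safe` is a hypothetical guard for illustration, not the library's actual fix):

```python
def compute_cond1(first_cache_position, attention_chunk_size):
    # Mirrors the failing line 763: a None right-hand side raises TypeError.
    return first_cache_position >= attention_chunk_size


def compute_cond1_safe(first_cache_position, attention_chunk_size):
    # Hypothetical guard: treat a missing chunk size as "no chunked attention".
    if attention_chunk_size is None:
        return False
    return first_cache_position >= attention_chunk_size


try:
    compute_cond1(0, None)
except TypeError as e:
    print(e)  # same class of error as in the traceback
```

So the question is why the config the example loads ends up with `attention_chunk_size` unset.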

I also see this warning beforehand:

```
Llama4ForCausalLM has no _prepare_4d_causal_attention_mask_with_cache_position method defined in its base modeling class. Compiled forward passes will be sub-optimal. If you're writing code, see Llama for an example implementation.
```

I'm getting the same error. Any updates on this?
