Unable to run the example

#17
by tommykoctur - opened

I tried the example code from the model card and it fails with:

```
    760 else:
    761     full_cache_length = attention_mask.shape[-1] if attention_mask is not None else sequence_length
--> 763     cond1 = first_cache_position >= attention_chunk_size
    764     cond2 = (first_cache_position < attention_chunk_size) & (
    765         first_cache_position + sequence_length > attention_chunk_size
    766     )
    767     key_length = (
    768         torch.where(
    769             cond1,
(...)
    774         else full_cache_length
    775     )

TypeError: '>=' not supported between instances of 'Tensor' and 'NoneType'
```
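Not a fix, but for anyone debugging: the traceback boils down to `attention_chunk_size` being `None` at the point where the modeling code compares it against the cache position. A minimal plain-Python sketch of that failure mode (the names are taken from the traceback above; `compute_cond1_safe` is a hypothetical guard for illustration, not the library's actual fix):

```python
def compute_cond1(first_cache_position, attention_chunk_size):
    # Mirrors the failing line 763: a None right-hand side raises TypeError.
    return first_cache_position >= attention_chunk_size


def compute_cond1_safe(first_cache_position, attention_chunk_size):
    # Hypothetical guard: treat a missing chunk size as "no chunked attention".
    if attention_chunk_size is None:
        return False
    return first_cache_position >= attention_chunk_size


try:
    compute_cond1(0, None)
except TypeError as e:
    print(e)  # same class of error as in the traceback
```

So the question is why the config the example loads ends up with `attention_chunk_size` unset.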

I also see this warning beforehand:

```
Llama4ForCausalLM has no _prepare_4d_causal_attention_mask_with_cache_position method defined in its base modeling class. Compiled forward passes will be sub-optimal. If you're writing code, see Llama for an example implementation.
```

I'm getting the same error. Any updates on this?
