"Bidirectional attention"

#1
opened by olivierdehaene (HF staff)

Hello,

It seems that, compared to the 7B variant, this model has bi-directional attention turned off. Is this intentional?
See this line, where is_causal is set to True in this variant,
vs.
this line, where is_causal is set to False in the 7B variant.
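For context, is_causal is the flag that toggles between causal and bi-directional attention. A minimal sketch of what the flag changes, using plain PyTorch scaled_dot_product_attention rather than this model's actual code:

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim)
q = k = v = torch.randn(1, 8, 5, 64)

# is_causal=True: each token attends only to itself and earlier tokens.
causal = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# is_causal=False: every token attends to the full sequence (bi-directional),
# which is what embedding models typically rely on.
bidirectional = F.scaled_dot_product_attention(q, k, v, is_causal=False)

# Outputs differ at every position except the last one,
# because the causal mask hides future tokens.
print(torch.allclose(causal, bidirectional))  # False
print(torch.allclose(causal[..., -1, :], bidirectional[..., -1, :], atol=1e-5))  # True
```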

Alibaba-NLP org

Thanks for your careful observation! Bi-directional attention is indeed used in this model. The code has now been updated.

olivierdehaene changed discussion status to closed
olivierdehaene changed discussion status to open

@zyznull, it seems that you didn't fully update the code. is_causal is still set to True by default in the model forward, which is the main entry point for Transformers and SentenceTransformers.
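To illustrate why the default matters (hypothetical signature, not the actual modeling code): Transformers and SentenceTransformers call the model's forward without passing is_causal, so whatever default the signature declares is what actually runs.

```python
# Hypothetical forward with the default left at True, as in the current code.
def forward(hidden_states, is_causal: bool = True):
    # The real model would run the attention stack here; this stub just
    # reports which attention mode the caller ended up with.
    return "causal" if is_causal else "bidirectional"

# Library code effectively calls it like this, never overriding the flag:
print(forward("dummy hidden states"))  # -> "causal", not "bidirectional"
```

So either the default in the signature has to be False, or the flag has to be hard-coded to False where the attention is called.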
