Introduce config files for simple & warning-free Sentence Transformers integration
Hello!
Pull Request overview
- Introduce config files for the Sentence Transformers integration: `modules.json` to inform which modules should be used (Transformer -> Pooling -> Normalization), the Transformer configuration, the Pooling configuration, and the overall Sentence Transformers configuration. A sketch of a typical `modules.json` follows after this list.
- Add a `sentence-transformers` tag to the model card metadata
- Reformat code snippets in the README to introduce more visual space
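
For reference, a minimal sketch of what such a `modules.json` typically looks like for a Transformer -> Pooling -> Normalize stack; the exact entries and paths for this repository may differ from the files in this PR:

```python
import json

# Illustrative module list: a Transformer module, followed by Pooling, followed by Normalize.
modules = [
    {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
    {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"},
    {"idx": 2, "name": "2", "path": "2_Normalize", "type": "sentence_transformers.models.Normalize"},
]

with open("modules.json", "w") as f:
    json.dump(modules, f, indent=2)
```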
Details
Nice work on this release! I see that you already have an ST integration - these configuration options make it a bit easier and warning-free for users to use it. E.g. users don't have to call `model.set_pooling_include_prompt` anymore.
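
For example, something along these lines should now work out of the box; the model id below is a placeholder for this repository, and the input sentences are only illustrative:

```python
from sentence_transformers import SentenceTransformer

# Placeholder model id; substitute this repository's name.
model = SentenceTransformer("org/model-name")

# With the Pooling configuration shipped in this PR, the following call is no
# longer needed; its effect is picked up from the config files automatically:
# model.set_pooling_include_prompt(False)

embeddings = model.encode(["What is the capital of France?", "Paris is the capital of France."])
print(embeddings.shape)
```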
Beyond that, I looked into the difference in outputs between Transformers and ST, and my belief is that it's related to the dtype assignment of the `inv_freq` in the RoPE. Notably, on the Sentence Transformers side, the `inv_freq` seems to update slightly compared to the `original_inv_freq`, which stays the same, and I'm able to reproduce this update exactly if I take the `inv_freq` from `transformers`, cast it to float32, and then cast it back to bf16. In short: I think there's a minor difference in how the model is moved/assigned to bf16 that's messing with the RoPE parameters very very slightly. As you mentioned already, it doesn't affect results.
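
For what it's worth, a rough sketch of that check; the model id is a placeholder, and the attribute path to the rotary embedding buffer is an assumption that varies across architectures and `transformers` versions:

```python
import torch
from transformers import AutoModel

# Placeholder model id; substitute this repository's name.
model = AutoModel.from_pretrained("org/model-name", torch_dtype=torch.bfloat16)

# Assumed attribute path to the RoPE inverse-frequency buffer.
inv_freq = model.rotary_emb.inv_freq

# The cast sequence described above: to float32 and back to bf16, then compare
# against the buffer as loaded (the same comparison can be made against the
# Sentence Transformers side's inv_freq).
roundtripped = inv_freq.to(torch.float32).to(torch.bfloat16)
print((inv_freq.float() - roundtripped.float()).abs().max())
```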
- Tom Aarsen
Hi Tom,
Sorry for missing this pull request earlier. Thank you so much for kindly helping us improve the Sentence Transformers support and for explaining the details! I just tested your pull request. It works like a charm! We are very grateful for your help!