Introduce config files for simple & warning-free Sentence Transformers integration

#2
opened by tomaarsen (HF Staff)
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "word_embedding_dimension": 4096,
+     "pooling_mode_cls_token": false,
+     "pooling_mode_mean_tokens": true,
+     "pooling_mode_max_tokens": false,
+     "pooling_mode_mean_sqrt_len_tokens": false,
+     "pooling_mode_weightedmean_tokens": false,
+     "pooling_mode_lasttoken": false,
+     "include_prompt": false
+ }
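For context, `pooling_mode_mean_tokens: true` with `include_prompt: false` means the sentence embedding is a masked mean over token embeddings, with the instruction tokens excluded from the average. A minimal sketch of that pooling (NumPy for illustration; the `n_prompt_tokens` argument and shapes are assumptions, not the library's API):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask, n_prompt_tokens=0):
    """Masked mean pooling; with include_prompt=False the instruction
    tokens at the start of the sequence are excluded from the average."""
    mask = attention_mask.copy().astype(float)
    mask[:, :n_prompt_tokens] = 0.0          # drop prompt tokens from pooling
    mask = mask[..., None]                   # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

# word_embedding_dimension=4096 as in the config above; other sizes are made up.
emb = mean_pool(np.random.randn(2, 8, 4096), np.ones((2, 8)), n_prompt_tokens=3)
print(emb.shape)  # (2, 4096)
```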
README.md CHANGED
@@ -6,6 +6,8 @@ language:
  license: cc-by-sa-4.0
  pipeline_tag: feature-extraction
  library_name: transformers
+ tags:
+ - sentence-transformers
  ---

  ## Model Summary
@@ -22,17 +24,20 @@ Make sure to install `transformers>=4.47.0` first!
  ### Transformers

  ```python
- from transformers import AutoModel, AutoTokenizer
+ from transformers import AutoModel
+
  model = AutoModel.from_pretrained("reasonir/ReasonIR-8B", torch_dtype="auto", trust_remote_code=True)
+ model = model.to("cuda")
+ model.eval()

  query = "The quick brown fox jumps over the lazy dog."
  document = "The quick brown fox jumps over the lazy dog."
  query_instruction = ""
  doc_instruction = ""
- model = model.to("cuda")
- model.eval()
+
  query_emb = model.encode(query, instruction=query_instruction)
  doc_emb = model.encode(document, instruction=doc_instruction)
+
  sim = query_emb @ doc_emb.T
  ```
@@ -43,26 +48,30 @@ When using `AutoModel`, it is important to:
  ### Sentence Transformers

- Ordinary retrieval models that use mean pooling can automatically be used with SentenceTransformer after being published on Huggingface.
+ In addition to Transformers, you can also use this model with Sentence Transformers:

  ```python
+ # pip install sentence-transformers
  from sentence_transformers import SentenceTransformer
+
  model_kwargs = {"torch_dtype": "auto"}
  model = SentenceTransformer("reasonir/ReasonIR-8B", trust_remote_code=True, model_kwargs=model_kwargs)
- model.set_pooling_include_prompt(include_prompt=False) # exclude the prompt during pooling

  query = "The quick brown fox jumps over the lazy dog."
  document = "The quick brown fox jumps over the lazy dog."
  query_instruction = ""
  doc_instruction = ""
+
  query_emb = model.encode(query, instruction=query_instruction)
  doc_emb = model.encode(document, instruction=doc_instruction)
- sim = query_emb @ doc_emb.T
+
+ sim = model.similarity(query_emb, doc_emb)
  ```

  It is important to also include `trust_remote_code=True` and `torch_dtype="auto"` as discussed earlier.

- NOTE: there seems to be some very slight floating point discrepancy when using the SentenceTransformer (because it does not support bf16 precision), though it should not affect the results in general.
+
+ > [!NOTE]
+ > There is a very slight floating point discrepancy when using the model via SentenceTransformer, caused by how the models are cast to the `bfloat16` dtype, though it should not affect the results in general.

  ## Citation
  ```
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "__version__": {
+         "sentence_transformers": "4.0.2",
+         "transformers": "4.48.2",
+         "pytorch": "2.6.0+cu124"
+     },
+     "prompts": {},
+     "default_prompt_name": null,
+     "similarity_fn_name": "cosine"
+ }
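Here `similarity_fn_name: "cosine"` selects the function that `model.similarity(query_emb, doc_emb)` applies to embedding pairs. A rough NumPy equivalent of cosine similarity (a sketch, not the library's implementation):

```python
import numpy as np

def cosine_similarity(a, b):
    # Normalize each row to unit length, then take pairwise dot products.
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

q = np.array([[1.0, 1.0]])
d = np.array([[1.0, 1.0], [1.0, -1.0]])
print(cosine_similarity(q, d))  # [[1. 0.]]
```

Since this model's pipeline ends with a Normalize module, its embeddings are already unit length, so cosine similarity reduces to a plain dot product.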
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+     {
+         "idx": 0,
+         "name": "0",
+         "path": "",
+         "type": "sentence_transformers.models.Transformer"
+     },
+     {
+         "idx": 1,
+         "name": "1",
+         "path": "1_Pooling",
+         "type": "sentence_transformers.models.Pooling"
+     },
+     {
+         "idx": 2,
+         "name": "2",
+         "path": "2_Normalize",
+         "type": "sentence_transformers.models.Normalize"
+     }
+ ]
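These three modules run in `idx` order when the model encodes text: the Transformer produces token embeddings, Pooling collapses them into one vector per input, and Normalize scales that vector to unit length. A shape-level sketch of the composition (NumPy stand-ins with made-up sizes, not the real 8B model):

```python
import numpy as np

# Stand-ins for the three modules listed above, applied in `idx` order.
transformer = lambda batch: np.random.randn(len(batch), 8, 4096)            # idx 0: token embeddings
pooling = lambda tok: tok.mean(axis=1)                                      # idx 1: mean pooling (simplified, unmasked)
normalize = lambda emb: emb / np.linalg.norm(emb, axis=-1, keepdims=True)   # idx 2: unit length

emb = normalize(pooling(transformer(["a query", "a document"])))
print(emb.shape)  # (2, 4096)
```

After the final Normalize step every embedding has norm 1, which is why a dot product between two outputs is already a cosine similarity.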
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+     "max_seq_length": 131072,
+     "do_lower_case": false
+ }