alvarobartt (HF Staff) committed
Commit c2c3466 · verified · 1 Parent(s): 5e9cbf6

Add Text Embeddings Inference (TEI) tag & snippet

Files changed (1): README.md (+41 -0)
README.md CHANGED
@@ -10,6 +10,7 @@ library_name: transformers
tags:
- sentence-transformers
- transformers.js
+ - text-embeddings-inference
---

# gte-reranker-modernbert-base
 
@@ -129,6 +130,46 @@ const { logits } = await model(inputs);
console.log(logits.tolist()); // [[2.138258218765259], [2.4609625339508057], [-1.6775450706481934]]
```

+ Additionally, you can deploy `Alibaba-NLP/gte-reranker-modernbert-base` with [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference) as follows:
+
+ - CPU
+
+ ```bash
+ docker run --platform linux/amd64 \
+     -p 8080:80 \
+     -v $PWD/data:/data \
+     --pull always \
+     ghcr.io/huggingface/text-embeddings-inference:cpu-1.7 \
+     --model-id Alibaba-NLP/gte-reranker-modernbert-base
+ ```
+
+ - GPU
+
+ ```bash
+ docker run --platform linux/amd64 \
+     --gpus all \
+     -p 8080:80 \
+     -v $PWD/data:/data \
+     --pull always \
+     ghcr.io/huggingface/text-embeddings-inference:1.7 \
+     --model-id Alibaba-NLP/gte-reranker-modernbert-base
+ ```
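+
+ Once the container is up, you can sanity-check the deployment through TEI's `/health` and `/info` routes (a minimal sketch, assuming the default port mapping from the commands above):
+
+ ```bash
+ # /health returns 200 once the model is loaded and the server is ready
+ curl -i http://0.0.0.0:8080/health
+
+ # /info returns metadata about the deployed model as JSON
+ curl http://0.0.0.0:8080/info
+ ```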
+
+ Then you can send requests to the deployed API via the `/rerank` route (see the [Text Embeddings Inference OpenAPI Specification](https://huggingface.github.io/text-embeddings-inference/) for more details):
+
+ ```bash
+ curl http://0.0.0.0:8080/rerank \
+     -H "Content-Type: application/json" \
+     -d '{
+         "query": "What is the capital of China?",
+         "raw_scores": false,
+         "return_text": false,
+         "texts": [ "Beijing" ],
+         "truncate": true,
+         "truncation_direction": "right"
+     }'
+ ```
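+
+ The response is a JSON array with one object per input text, each containing the text's position in the `texts` array (`index`) and its relevance `score`; setting `raw_scores` to `true` returns the model's unnormalized scores instead. As a sketch, reranking several candidate passages (the extra passages below are illustrative) looks like:
+
+ ```bash
+ # Rerank three candidates against the same query; each result's "index"
+ # points back into the "texts" array
+ curl http://0.0.0.0:8080/rerank \
+     -H "Content-Type: application/json" \
+     -d '{
+         "query": "What is the capital of China?",
+         "texts": [ "Beijing", "Shanghai is the largest city in China.", "The capital of China is Beijing." ]
+     }'
+ ```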
+
## Training Details

The `gte-modernbert` series of models follows the training scheme of the previous [GTE models](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469), with the only difference being that the pre-training base model has been switched from [GTE-MLM](https://huggingface.co/Alibaba-NLP/gte-en-mlm-base) to [ModernBERT](https://huggingface.co/answerdotai/ModernBERT-base). For more training details, please refer to our paper: [mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval](https://aclanthology.org/2024.emnlp-industry.103/).