Intel/bge-large-en-v1.5-rag-int8-static

Feature Extraction

text-embeddings-inference

peterizsak committed on Feb 19, 2024

Commit 8569aca · verified · 1 Parent(s): 2e43ad1

Upload README.md

Files changed (1)

README.md +9 -8

README.md CHANGED

@@ -6,11 +6,9 @@ language:
 # BGE-large-en-v1.5-rag-int8-static
-A quantized version of [BAAI/BGE-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) embedder compatible with [Optimum-Intel](https://github.com/huggingface/optimum-intel) and [Intel® Neural Compressor](https://github.com/huggingface/optimum-intel).
-The model can be used with [Optimum-Intel](https://github.com/huggingface/optimum-intel) API and as an embedder/ranker model as part of [fastRAG](https://github.com/IntelLabs/fastRAG).
-See [model page](https://huggingface.co/BAAI/bge-large-en-v1.5) for full details on model architecture and training details.
 ## Technical details
@@ -27,9 +25,12 @@ Instructions how to reproduce the quantized model can be found [here](https://gi
 ## Evaluation - MTEB
 |  | `INT8` | `FP32` | % diff |
 |---|:---:|:---:|:---:|
-| Reranking | 0.5997 | 0.6003 | -0.108% |
 ## Usage
@@ -38,15 +39,15 @@ Instructions how to reproduce the quantized model can be found [here](https://gi
 See [Optimum-intel](https://github.com/huggingface/optimum-intel) installation page for instructions how to install. Or run:
 ``` sh
-pip install -U optimum[neural-compressor] intel-extension-for-transformers
 ```
 Loading a model:
 ``` python
-from optimum.intel import INCModel
-model = INCModel.from_pretrained("Intel/bge-large-en-v1.5-rag-int8-static")
 ```
 Running inference:

 # BGE-large-en-v1.5-rag-int8-static
+A version of [BAAI/BGE-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor) and compatible with [Optimum-Intel](https://github.com/huggingface/optimum-intel).
+The model can be used with the [Optimum-Intel](https://github.com/huggingface/optimum-intel) API, either as a standalone model or as an embedder or ranker module in a [fastRAG](https://github.com/IntelLabs/fastRAG) RAG pipeline.
 ## Technical details
 ## Evaluation - MTEB
+Model performance on the [Massive Text Embedding Benchmark (MTEB)](https://huggingface.co/spaces/mteb/leaderboard) *retrieval* and *reranking* tasks.
 |  | `INT8` | `FP32` | % diff |
 |---|:---:|:---:|:---:|
+| Reranking | 0.5997 | 0.6003  | -0.108% |
+| Retrieval | 0.5346  | 0.5429 | -1.53%  |
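
The `% diff` column can be sanity-checked with a few lines of Python. This is a minimal sketch that assumes the column is the relative change of the `INT8` score against the `FP32` score, i.e. `(INT8 - FP32) / FP32`; the helper name `pct_diff` is hypothetical and not part of any library.

```python
def pct_diff(quantized: float, baseline: float) -> float:
    """Relative change of the quantized score versus the baseline, in percent.

    Assumption: this is how the "% diff" column is defined; the README
    does not state the formula explicitly.
    """
    return (quantized - baseline) / baseline * 100.0


# Rounded scores copied from the table: (INT8, FP32).
scores = {
    "Reranking": (0.5997, 0.6003),
    "Retrieval": (0.5346, 0.5429),
}

for task, (int8, fp32) in scores.items():
    print(f"{task}: {pct_diff(int8, fp32):+.3f}%")
```

Under that assumption the retrieval row reproduces the listed -1.53%. The reranking row recomputed from the rounded scores comes out at roughly -0.100% rather than the listed -0.108%; a plausible explanation is that the table was derived from unrounded scores, but the source does not say.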
 ## Usage
 See [Optimum-intel](https://github.com/huggingface/optimum-intel) installation page for instructions how to install. Or run:
 ``` sh
+pip install -U "optimum[neural-compressor,ipex]" intel-extension-for-transformers
 ```
 Loading a model:
 ``` python
+from optimum.intel import IPEXModel
+model = IPEXModel.from_pretrained("Intel/bge-large-en-v1.5-rag-int8-static")
 ```
 Running inference: