111 (#8), opened by MatrixYao
- README.md +9 -66
- model.safetensors +0 -3
- onnx/model.onnx +0 -3
README.md
CHANGED
@@ -2622,30 +2622,18 @@ language:
|
|
2622 |
<p>
|
2623 |
</h4>
|
2624 |
|
2625 |
-
|
2626 |
-
|
2627 |
-
If you are looking for a model that supports more languages, longer texts, and other retrieval methods, you can try using [bge-m3](https://huggingface.co/BAAI/bge-m3).
|
2628 |
|
2629 |
|
2630 |
[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
|
2631 |
|
2632 |
-
FlagEmbedding
|
2633 |
-
|
2634 |
-
|
2635 |
-
|
2636 |
-
-
|
2637 |
-
-
|
2638 |
-
-
|
2639 |
-
|
2640 |
-
## News
|
2641 |
-
- 1/30/2024: Release **BGE-M3**, a new member to BGE model series! M3 stands for **M**ulti-linguality (100+ languages), **M**ulti-granularities (input length up to 8192), **M**ulti-Functionality (unification of dense, lexical, multi-vec/colbert retrieval).
|
2642 |
-
It is the first embedding model that supports all three retrieval methods, achieving new SOTA on multi-lingual (MIRACL) and cross-lingual (MKQA) benchmarks.
|
2643 |
-
[Technical Report](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/BGE_M3.pdf) and [Code](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3). :fire:
|
2644 |
-
- 1/9/2024: Release [Activation-Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon), an effective, efficient, compatible, and low-cost (training) method to extend the context length of LLM. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
|
2645 |
-
- 12/24/2023: Release **LLaRA**, a LLaMA-7B based dense retriever, leading to state-of-the-art performances on MS MARCO and BEIR. Model and code will be open-sourced. Please stay tuned. [Technical Report](https://arxiv.org/abs/2312.15503) :fire:
|
2646 |
-
- 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
|
2647 |
-
- 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)
|
2648 |
-
- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) and [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE have been released.
|
2649 |
- 09/12/2023: New models:
|
2650 |
- **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than the embedding models. We recommend using or fine-tuning them to re-rank the top-k documents returned by embedding models.
|
2651 |
- **update embedding model**: release `bge-*-v1.5` embedding model to alleviate the issue of the similarity distribution, and enhance its retrieval ability without instruction.
|
@@ -2670,7 +2658,6 @@ It is the first embedding model that supports all three retrieval methods, achie
|
|
2670 |
|
2671 |
| Model | Language | | Description | query instruction for retrieval [1] |
|
2672 |
|:-------------------------------|:--------:| :--------:| :--------:|:--------:|
|
2673 |
-
| [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) | Multilingual | [Inference](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3#usage) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3) | Multi-Functionality(dense retrieval, sparse retrieval, multi-vector(colbert)), Multi-Linguality, and Multi-Granularity(8192 tokens) | |
|
2674 |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
|
2675 |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
2676 |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
@@ -2687,6 +2674,7 @@ It is the first embedding model that supports all three retrieval methods, achie
|
|
2687 |
| [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a base-scale model but with similar ability to `bge-large-zh` | `为这个句子生成表示以用于检索相关文章:` |
|
2688 |
| [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a small-scale model but with competitive performance | `为这个句子生成表示以用于检索相关文章:` |
|
2689 |
|
|
|
2690 |
[1\]: If you need to search for passages relevant to a query, we suggest adding the instruction to the query; in other cases, no instruction is needed and you can use the original query directly. In all cases, **no instruction** needs to be added to passages.
|
2691 |
|
2692 |
[2\]: Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. To balance accuracy and time cost, a cross-encoder is widely used to re-rank the top-k documents retrieved by other, simpler models.
|
@@ -2864,51 +2852,6 @@ sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, di
|
|
2864 |
print("Sentence embeddings:", sentence_embeddings)
|
2865 |
```
|
2866 |
|
2867 |
-
#### Usage of the ONNX files
|
2868 |
-
|
2869 |
-
```python
|
2870 |
-
from optimum.onnxruntime import ORTModelForFeatureExtraction # type: ignore
|
2871 |
-
|
2872 |
-
import torch
|
2873 |
-
from transformers import AutoModel, AutoTokenizer
|
2874 |
-
|
2875 |
-
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-large-en-v1.5')
|
2876 |
-
model = AutoModel.from_pretrained('BAAI/bge-large-en-v1.5', revision="refs/pr/13")
|
2877 |
-
model_ort = ORTModelForFeatureExtraction.from_pretrained('BAAI/bge-large-en-v1.5', revision="refs/pr/13",file_name="onnx/model.onnx")
|
2878 |
-
|
2879 |
-
# Sentences we want sentence embeddings for
|
2880 |
-
sentences = ["样例数据-1", "样例数据-2"]
|
2881 |
-
|
2882 |
-
# Tokenize sentences
|
2883 |
-
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
|
2884 |
-
# for s2p(short query to long passage) retrieval task, add an instruction to query (not add instruction for passages)
|
2885 |
-
# encoded_input = tokenizer([instruction + q for q in queries], padding=True, truncation=True, return_tensors='pt')
|
2886 |
-
|
2887 |
-
model_output_ort = model_ort(**encoded_input)
|
2888 |
-
# Compute token embeddings
|
2889 |
-
with torch.no_grad():
|
2890 |
-
    model_output = model(**encoded_input)
|
2891 |
-
|
2892 |
-
# model_output and model_output_ort are identical
|
2893 |
-
|
2894 |
-
```
|
2895 |
-
|
2896 |
-
It's also possible to deploy the ONNX files with the [infinity_emb](https://github.com/michaelfeil/infinity) pip package.
|
2897 |
-
```python
|
2898 |
-
import asyncio
|
2899 |
-
from infinity_emb import AsyncEmbeddingEngine, EngineArgs
|
2900 |
-
|
2901 |
-
sentences = ["Embed this is sentence via Infinity.", "Paris is in France."]
|
2902 |
-
engine = AsyncEmbeddingEngine.from_args(
|
2903 |
-
EngineArgs(model_name_or_path = "BAAI/bge-large-en-v1.5", device="cpu", engine="optimum" # or engine="torch"
|
2904 |
-
))
|
2905 |
-
|
2906 |
-
async def main():
|
2907 |
-
    async with engine:
|
2908 |
-
        embeddings, usage = await engine.embed(sentences=sentences)
|
2909 |
-
asyncio.run(main())
|
2910 |
-
```
|
2911 |
-
|
2912 |
### Usage for Reranker
|
2913 |
|
2914 |
Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding.
|
|
|
2622 |
<p>
|
2623 |
</h4>
|
2624 |
|
2625 |
+
For more details, please refer to our GitHub repository: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding).
|
|
|
|
|
2626 |
|
2627 |
|
2628 |
[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
|
2629 |
|
2630 |
+
FlagEmbedding can map any text to a low-dimensional dense vector which can be used for tasks like retrieval, classification, clustering, or semantic search.
|
2631 |
+
It can also be used in vector databases for LLMs.
|
2632 |
+
|
2633 |
+
************* 🌟**Updates**🌟 *************
|
2634 |
+
- 10/12/2023: Release [LLM-Embedder](./FlagEmbedding/llm_embedder/README.md), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Paper](https://arxiv.org/pdf/2310.07554.pdf) :fire:
|
2635 |
+
- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) of BGE has been released
|
2636 |
+
- 09/15/2023: The [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE has been released
|
2637 |
- 09/12/2023: New models:
|
2638 |
- **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than the embedding models. We recommend using or fine-tuning them to re-rank the top-k documents returned by embedding models.
|
2639 |
- **update embedding model**: release `bge-*-v1.5` embedding model to alleviate the issue of the similarity distribution, and enhance its retrieval ability without instruction.
|
|
|
2658 |
|
2659 |
| Model | Language | | Description | query instruction for retrieval [1] |
|
2660 |
|:-------------------------------|:--------:| :--------:| :--------:|:--------:|
|
|
|
2661 |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
|
2662 |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
2663 |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
|
|
2674 |
| [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a base-scale model but with similar ability to `bge-large-zh` | `为这个句子生成表示以用于检索相关文章:` |
|
2675 |
| [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | Chinese | [Inference](#usage-for-embedding-model) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune) | a small-scale model but with competitive performance | `为这个句子生成表示以用于检索相关文章:` |
|
2676 |
|
2677 |
+
|
2678 |
[1\]: If you need to search for passages relevant to a query, we suggest adding the instruction to the query; in other cases, no instruction is needed and you can use the original query directly. In all cases, **no instruction** needs to be added to passages.
|
2679 |
|
2680 |
[2\]: Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. To balance accuracy and time cost, a cross-encoder is widely used to re-rank the top-k documents retrieved by other, simpler models.
|
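Footnote [1] can be illustrated with a minimal sketch: the instruction is prepended to queries only, and passages are passed through unchanged. The instruction string is the one listed in the model table for the Chinese `bge-*-zh` models; the helper name `build_retrieval_inputs` and the example texts are hypothetical and not part of FlagEmbedding.

```python
# Query instruction copied from the model table (Chinese bge models).
QUERY_INSTRUCTION = "为这个句子生成表示以用于检索相关文章:"


def build_retrieval_inputs(queries, passages, instruction=QUERY_INSTRUCTION):
    """Hypothetical helper: prefix each query with the instruction and
    leave the passages untouched, as footnote [1] recommends."""
    prefixed_queries = [instruction + q for q in queries]
    return prefixed_queries, list(passages)


# Placeholder texts, for illustration only.
queries = ["example query"]
passages = ["example passage one", "example passage two"]

model_queries, model_passages = build_retrieval_inputs(queries, passages)
print(model_queries)
print(model_passages)
```

The two returned lists would then be tokenized and encoded separately, so that only the query side carries the instruction.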
|
|
2852 |
print("Sentence embeddings:", sentence_embeddings)
|
2853 |
```
|
2854 |
|
2855 |
### Usage for Reranker
|
2856 |
|
2857 |
Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding.
|
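The two-stage use of a reranker described in the README (an embedding model retrieves the top-k documents, then a cross-encoder re-ranks them) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the vectors are toy values rather than real BGE embeddings, and `toy_pair_score` is a word-overlap stand-in for a real reranker such as `BAAI/bge-reranker-base`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two plain Python vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def retrieve_then_rerank(query, query_vec, docs, doc_vecs, pair_score, k=2):
    """Return document indices: top-k by embedding similarity, re-ordered by pair_score."""
    # Stage 1: cheap ranking of every document by embedding similarity.
    ranked = sorted(
        range(len(docs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    candidates = ranked[:k]
    # Stage 2: the more expensive pair scorer sees only the k candidates.
    return sorted(candidates, key=lambda i: pair_score(query, docs[i]), reverse=True)


def toy_pair_score(query, doc):
    """Hypothetical stand-in for a cross-encoder: counts shared words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


docs = [
    "pandas are bears native to china",
    "the panda library handles tables",
    "paris is in france",
]
doc_vecs = [[0.8, 0.3], [0.9, 0.1], [0.0, 1.0]]  # toy embeddings
query = "where are pandas native"
query_vec = [1.0, 0.2]  # toy embedding

order = retrieve_then_rerank(query, query_vec, docs, doc_vecs, toy_pair_score, k=2)
print([docs[i] for i in order])
```

Here the embedding stage places the second document first, and the pair scorer moves the first document back to the top, which is the kind of correction the re-ranking stage is meant to provide.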
model.safetensors
DELETED
@@ -1,3 +0,0 @@
|
|
1 |
-
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:45e1954914e29bd74080e6c1510165274ff5279421c89f76c418878732f64ae7
|
3 |
-
size 1340616616
|
onnx/model.onnx
DELETED
@@ -1,3 +0,0 @@
|
|
1 |
-
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:69ed3f810d3b6d13f70dff9ca89966f39c0a0e877fb88211be7bcc070df2a2ce
|
3 |
-
size 1336854281
|