nurasaki committed on
Commit
b1d1e48
·
1 Parent(s): 57882da

Improved README.md

  1. README.md +25 -0
README.md CHANGED
@@ -14,3 +14,28 @@ short_description: Conversational space enhanced with Viquipedia RAG
14
  # Wiki Tools
15
 
16
  An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
17
+
18
+ This space demonstrates how to build a conversational AI application enhanced with Retrieval-Augmented Generation (RAG) using a Vector Database (VectorDB) built from Viquipedia articles.
19
+
20
+
21
+ ### Current resources
22
+ - **VectorDB mRoBERTa**: 9.9GB / 2400k vectors [langtech-innovation/mRoberta_experimental_ViquipediaVectorStore](https://huggingface.co/langtech-innovation/mRoberta_experimental_ViquipediaVectorStore)
23
+ - **Embedding model mRoBERTa**: [langtech-innovation/sentence-mRoBERTa-v0](https://huggingface.co/langtech-innovation/sentence-mRoBERTa-v0)
24
+ - **LLM endpoint model**: Salamandra-7B-Instruct-Tools-16k
25
+
26
+ <!-- Disclaimer -->
27
+ > [!WARNING]
28
+ > **DISCLAIMER:** This model is an **experimental version** and is provided for **research purposes only**.
29
+ > Access is **not public**.
30
+ > Please do not share.
31
+
32
+
33
+ ### Alternative resources
34
+
35
+ Alternative resources are available for the embeddings and the VectorDB.
36
+
37
+ Set the variables `VS_HF_PATH=langtech-innovation/vdb-cawiki-v3` and `EMBEDDINGS_MODEL=BAAI/bge-m3` to switch resources. With these values, the system uses:
38
+
39
+ - **VectorDB BGE-M3**: 12.3GB / 2400k vectors [langtech-innovation/vdb-cawiki-v3](https://huggingface.co/langtech-innovation/vdb-cawiki-v3)
40
+ - **Embedding model BGE-M3**: [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
41
+
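
The resource switch described in the README could be read on the application side roughly as follows. This is a minimal sketch, not the Space's actual code: it assumes `VS_HF_PATH` and `EMBEDDINGS_MODEL` reach the app as environment variables, and that the app falls back to the resources listed under "Current resources" when they are unset. The helper name `resolve_resources` and the two `DEFAULT_*` constants are hypothetical.

```python
import os

# Assumed defaults: the repositories listed under "Current resources".
# The README does not state what the app does when the variables are unset.
DEFAULT_VS_HF_PATH = "langtech-innovation/mRoberta_experimental_ViquipediaVectorStore"
DEFAULT_EMBEDDINGS_MODEL = "langtech-innovation/sentence-mRoBERTa-v0"


def resolve_resources(env=None):
    """Return the VectorDB repo and embedding model to use (hypothetical helper).

    `env` is any mapping of variable names to values; it defaults to
    os.environ so the function can be tested without touching the process.
    """
    env = os.environ if env is None else env
    return {
        # Empty strings are treated the same as unset variables.
        "vs_hf_path": env.get("VS_HF_PATH") or DEFAULT_VS_HF_PATH,
        "embeddings_model": env.get("EMBEDDINGS_MODEL") or DEFAULT_EMBEDDINGS_MODEL,
    }


if __name__ == "__main__":
    print(resolve_resources())
```

The VectorDB and the embedding model should be switched together: a vector store built with one embedding model cannot be queried meaningfully with vectors from another, which is presumably why the README lists the two variables as a pair.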