# Wiki Tools
An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).

This space demonstrates how to build a conversational AI application enhanced with Retrieval-Augmented Generation (RAG) using a Vector Database (VectorDB) built from Viquipedia articles.
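The core RAG step can be sketched in plain Python as follows. The passages and 3-dimensional vectors here are toy stand-ins: in the Space, embeddings would come from the sentence-mRoBERTa model and the vectors from the Viquipedia store.

```python
import math

# Toy in-memory "VectorDB": (passage, embedding) pairs with made-up vectors.
DOCS = [
    ("Barcelona is the capital of Catalonia.", [0.9, 0.1, 0.0]),
    ("The Ebre is a river in the Iberian Peninsula.", [0.1, 0.9, 0.2]),
    ("FC Barcelona is a football club.", [0.7, 0.2, 0.6]),
]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, k=2):
    """Return the k passages whose embeddings are closest to the query."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_vec):
    """Prepend retrieved context to the user question before calling the LLM."""
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

A production vector store replaces the linear scan with an approximate nearest-neighbour index, but the retrieve-then-augment flow is the same.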

### Current resources

- **VectorDB mRoBERTA**: 9.9 GB / 2400k vectors, [langtech-innovation/mRoberta_experimental_ViquipediaVectorStore](https://huggingface.co/langtech-innovation/mRoberta_experimental_ViquipediaVectorStore)
- **Embedding model mRoBERTA**: [langtech-innovation/sentence-mRoBERTa-v0](https://huggingface.co/langtech-innovation/sentence-mRoBERTa-v0)
- **LLM endpoint model**: Salamandra-7B-Instruct-Tools-16k

<!-- Disclaimer -->
> [!WARNING]
> **DISCLAIMER:** This model is an **experimental version** and is provided for **research purposes only**.
> Access is **not public**.
> Please do not share.

### Alternative resources

Alternative resources are available for the embedding model and the VectorDB.

Set the `VS_HF_PATH=langtech-innovation/vdb-cawiki-v3` and `EMBEDDINGS_MODEL=BAAI/bge-m3` variables to switch between resources. In this case, the system will use:

- **VectorDB BGE-M3**: 12.3 GB / 2400k vectors, [langtech-innovation/vdb-cawiki-v3](https://huggingface.co/langtech-innovation/vdb-cawiki-v3)
- **Embedding model BGE-M3**: [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
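For a local run, the switch described above could be made by exporting the variables before starting the app; on a hosted Space they would instead be set in the Space settings.

```shell
# Point the app at the BGE-M3 vector store and embedding model.
export VS_HF_PATH=langtech-innovation/vdb-cawiki-v3
export EMBEDDINGS_MODEL=BAAI/bge-m3
```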