# Norwegian RAG Chatbot Project Structure

## Overview

This document outlines the project structure for our lightweight Norwegian RAG chatbot implementation, which uses Hugging Face's Inference API instead of running models locally.

## Directory Structure
```
chatbot_project/
├── design/                          # Design documents
│   ├── rag_architecture.md
│   ├── document_processing.md
│   └── chat_interface.md
├── research/                        # Research findings
│   └── norwegian_llm_research.md
├── src/                             # Source code
│   ├── api/                         # API integration
│   │   ├── __init__.py
│   │   ├── huggingface_api.py       # HF Inference API integration
│   │   └── config.py                # API configuration
│   ├── document_processing/         # Document processing
│   │   ├── __init__.py
│   │   ├── extractor.py             # Text extraction from documents
│   │   ├── chunker.py               # Text chunking
│   │   └── processor.py             # Main document processor
│   ├── rag/                         # RAG implementation
│   │   ├── __init__.py
│   │   ├── retriever.py             # Document retrieval
│   │   └── generator.py             # Response generation
│   ├── web/                         # Web interface
│   │   ├── __init__.py
│   │   ├── app.py                   # Gradio app
│   │   └── embed.py                 # Embedding functionality
│   ├── utils/                       # Utilities
│   │   ├── __init__.py
│   │   └── helpers.py               # Helper functions
│   └── main.py                      # Main application entry point
├── data/                            # Data storage
│   ├── documents/                   # Original documents
│   └── processed/                   # Processed documents and embeddings
├── tests/                           # Tests
│   ├── test_api.py
│   ├── test_document_processing.py
│   └── test_rag.py
├── venv/                            # Virtual environment
├── requirements-ultra-light.txt     # Lightweight dependencies
├── requirements.txt                 # Original requirements (for reference)
└── README.md                        # Project documentation
```
## Key Components

### 1. API Integration (`src/api/`)

- `huggingface_api.py`: Integration with the Hugging Face Inference API for both the LLM and the embedding model
- `config.py`: Configuration for API endpoints, model IDs, and API keys
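As a rough illustration of what `huggingface_api.py` might contain, the sketch below calls the Inference API with only the standard library. The model IDs, environment variable name, and endpoint layout are assumptions for illustration, not choices confirmed by this document.

```python
import json
import os
import urllib.request

# Hypothetical sketch of src/api/huggingface_api.py. The model IDs below are
# illustrative placeholders for a Norwegian LLM and a multilingual embedder.
API_BASE = "https://api-inference.huggingface.co/models"
LLM_MODEL = "norallm/normistral-7b-warm-instruct"
EMBED_MODEL = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"


def build_request(model_id: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated POST request for the Inference API."""
    return urllib.request.Request(
        f"{API_BASE}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The API key is read from the environment, never hard-coded.
            "Authorization": f"Bearer {os.environ.get('HF_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Send a text-generation request to the LLM endpoint."""
    req = build_request(LLM_MODEL, {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    })
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())[0]["generated_text"]


def embed(texts: list[str]) -> list[list[float]]:
    """Send a feature-extraction request to the embedding endpoint."""
    req = build_request(EMBED_MODEL, {"inputs": texts})
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())
```

In practice `API_BASE` and the model IDs would live in `config.py`; they are inlined here to keep the sketch self-contained.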
### 2. Document Processing (`src/document_processing/`)

- `extractor.py`: Extract text from various document formats
- `chunker.py`: Split documents into manageable chunks
- `processor.py`: Orchestrate the document processing pipeline
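A minimal sketch of the chunking step, assuming simple character-based splitting with overlap; the default sizes are illustrative, not values taken from this document:

```python
# Hypothetical sketch of src/document_processing/chunker.py.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character-based chunks.

    Overlap preserves context across chunk boundaries so that a sentence
    cut in half by one chunk is still intact in the next.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars
    return chunks
```

A production chunker would likely split on sentence or paragraph boundaries instead of raw character counts, but the overlap idea carries over unchanged.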
### 3. RAG Implementation (`src/rag/`)

- `retriever.py`: Retrieve relevant document chunks based on the query
- `generator.py`: Generate responses using the retrieved context
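The retrieval step can be sketched as a plain cosine-similarity ranking over the stored embeddings. This assumes embeddings arrive as Python lists of floats from the remote API; the function names are illustrative:

```python
import math

# Hypothetical sketch of src/rag/retriever.py.
def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, chunk_vecs, top_k=3):
    """Return the top_k chunks most similar to the query embedding."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:top_k]]
```

The generator would then interpolate the returned chunks into the LLM prompt as context before calling the Inference API.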
### 4. Web Interface (`src/web/`)

- `app.py`: Gradio web interface for the chatbot
- `embed.py`: Generate embedding code for website integration
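A skeletal `app.py` might look like the following. The chat function here is a stub standing in for the real retrieve-then-generate pipeline, and the title string is an assumption:

```python
# Hypothetical sketch of src/web/app.py.
def chat_fn(message: str, history: list) -> str:
    # In the real app this would call the retriever and generator;
    # here it only echoes the input ("Mottatt" = "Received" in Norwegian).
    return f"Mottatt: {message}"


if __name__ == "__main__":
    # Imported lazily so the chat function stays testable without Gradio.
    import gradio as gr

    demo = gr.ChatInterface(fn=chat_fn, title="Norwegian RAG Chatbot")
    demo.launch()
```

`gr.ChatInterface` handles the message/history plumbing, so the app only needs to supply a function mapping the latest message (plus history) to a reply.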
### 5. Main Application (`src/main.py`)

- Entry point for the application
- Orchestrates the different components
## Implementation Approach

1. **Remote Model Execution**: Use Hugging Face's Inference API for both the LLM and the embedding model
2. **Lightweight Document Processing**: Process documents locally, but use the remote API for embedding generation
3. **Simple Vector Storage**: Store embeddings in a simple file-based format rather than a dedicated vector database
4. **Gradio Interface**: Create a simple but effective chat interface using Gradio
5. **Hugging Face Spaces Deployment**: Deploy the final solution to Hugging Face Spaces
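The file-based vector storage in point 3 could be as simple as a JSON file of chunk/embedding pairs. The `data/processed/index.json` path and record layout below are assumptions, not details specified by this document:

```python
import json
from pathlib import Path

# Minimal sketch of the file-based vector store; layout is an assumption.
def save_index(chunks, embeddings, path="data/processed/index.json"):
    """Persist chunk texts alongside their embeddings as JSON records."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    records = [{"text": t, "embedding": e} for t, e in zip(chunks, embeddings)]
    Path(path).write_text(json.dumps(records, ensure_ascii=False))


def load_index(path="data/processed/index.json"):
    """Load the stored chunks and embeddings back into parallel lists."""
    records = json.loads(Path(path).read_text())
    return [r["text"] for r in records], [r["embedding"] for r in records]
```

For the corpus sizes a lightweight chatbot handles, loading the whole index into memory and scanning it linearly is usually fast enough, which is what makes skipping a dedicated vector database viable.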