title: WAQO
emoji: 🐢
colorFrom: indigo
colorTo: yellow
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Wakili! A quick one!
WAQO - Wakili, A Quick One
A legal assistant chatbot for the Kenya Finance Bill 2025 that provides easy-to-understand explanations of legal concepts and implications.
Features
- Interactive chat interface for asking questions about the Finance Bill 2025
- Multi-language support (English, Kiswahili, Luo)
- RAG (Retrieval-Augmented Generation) system that grounds responses in the text of the bill
- Friendly, conversational tone with Kenyan context
Setup Instructions
Local Development
Clone the repository:
git clone https://huggingface.co/spaces/Wanxai/WAQO
cd WAQO
Create a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
Install dependencies:
pip install -r requirements.txt
Create a .env file in the project root with your Google API key:
GOOGLE_API_KEY=your_api_key_here
Download the Finance Bill 2025 PDF:
- Create a data directory in the project root
- Place the Finance Bill 2025 PDF in the data directory
- Name it finance-bill-2025.pdf
Run the application:
python app.py
Access the web interface at http://localhost:7860
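Before launching, it can help to confirm that the two prerequisites from the steps above are in place. The following is a minimal sketch, not part of the project: the helper names `read_env_file` and `find_setup_problems` are hypothetical, and the .env parser handles only plain KEY=VALUE lines. The variable name and PDF path are the ones given in the steps above.

```python
import os
from pathlib import Path

# Names taken from the setup steps above; the helpers themselves are hypothetical.
REQUIRED_ENV_VAR = "GOOGLE_API_KEY"
PDF_RELATIVE_PATH = Path("data") / "finance-bill-2025.pdf"


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_setup_problems(project_root: Path, env: dict) -> list:
    """Return human-readable problems; an empty list means setup looks complete."""
    problems = []
    if not env.get(REQUIRED_ENV_VAR):
        problems.append(f"{REQUIRED_ENV_VAR} is not set")
    pdf_path = project_root / PDF_RELATIVE_PATH
    if not pdf_path.is_file():
        problems.append(f"missing PDF: {pdf_path}")
    return problems


if __name__ == "__main__":
    root = Path.cwd()
    # Values already in the process environment take precedence over .env.
    env = {**read_env_file(root / ".env"), **os.environ}
    for problem in find_setup_problems(root, env):
        print("Setup problem:", problem)
```

Run it from the project root; no output means both the API key and the PDF were found.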
Deploying to Hugging Face Spaces
Fork this repository to your Hugging Face account
In the Hugging Face Space settings, add your Google API key as a secret:
- Name: GOOGLE_API_KEY
- Value: Your Google Generative AI API key
Upload the Finance Bill 2025 PDF:
- Go to the "Files" tab in your Space
- Create a data directory
- Upload the PDF file as finance-bill-2025.pdf
The Space will automatically deploy with the correct environment
Project Structure
- app.py: Main application with Gradio interface
- main.py: FastAPI server entry point
- app/services/: Core services for the chatbot
  - llm_service.py: Handles interaction with Google's Generative AI
  - vector_store.py: Manages the vector database for RAG
  - document_processor.py: Processes the PDF document
- app/models/: Data models
- app/core/: Configuration and utilities
- data/: Directory for storing the Finance Bill PDF
License
This project is licensed under the MIT License - see the LICENSE file for details.
Finance Bill RAG System
A Retrieval-Augmented Generation (RAG) system that processes a locally stored Finance Bill PDF and allows users to query it using natural language. The system uses Google's Gemini 1.5 Flash LLM to generate clear, concise responses based on the document content.
Features
- Automatic PDF processing on startup
- Multiple PDF text extraction methods (PyPDF and PDFPlumber)
- Intelligent text chunking for better context retrieval
- Vector storage using ChromaDB for semantic search
- Natural language querying using Gemini 1.5 Flash LLM
- Markdown-formatted responses for readability
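The chunking step listed above can be illustrated with a simple fixed-size splitter that overlaps neighbouring chunks, so a sentence cut at a boundary still appears whole in one of them. This is a sketch under assumptions: the function name `chunk_text`, the character-based sizes, and the default values are illustrative and may differ from what document_processor.py actually does.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters of context.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Larger overlaps improve the odds that a retrieved chunk carries enough surrounding context, at the cost of storing more near-duplicate text in the vector database.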
System Architecture
- FastAPI Backend: High-performance API with a single query endpoint
- ChromaDB: Vector database for storing and retrieving document chunks
- Gemini 1.5 Flash: Advanced LLM for generating human-friendly responses
- PDF Processing Pipeline: Robust extraction with multiple fallback methods
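The fallback behaviour of the PDF processing pipeline could look roughly like the sketch below: try each extractor in order and move on when one raises or returns no text. The function names are hypothetical, and it is an assumption that "PyPDF" refers to the `pypdf` package; the project's own document_processor.py may be organised differently.

```python
from typing import Callable, Iterable


def extract_with_pypdf(path: str) -> str:
    from pypdf import PdfReader  # lazy import; assumes the pypdf package is installed

    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def extract_with_pdfplumber(path: str) -> str:
    import pdfplumber  # lazy import; assumes the pdfplumber package is installed

    with pdfplumber.open(path) as pdf:
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def extract_text_with_fallback(
    path: str, extractors: Iterable[Callable[[str], str]]
) -> str:
    """Return text from the first extractor that yields non-empty output."""
    errors = []
    for extractor in extractors:
        try:
            text = extractor(path)
        except Exception as exc:  # any failure means: try the next method
            errors.append(f"{extractor.__name__}: {exc}")
            continue
        if text.strip():
            return text
        errors.append(f"{extractor.__name__}: no text extracted")
    raise RuntimeError("all extractors failed: " + "; ".join(errors))
```

A call such as `extract_text_with_fallback("data/finance-bill-2025.pdf", [extract_with_pypdf, extract_with_pdfplumber])` would try pypdf first and fall back to pdfplumber only if needed.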
Setup
- Clone the repository
- Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Create a .env file and add your Google API key:
GOOGLE_API_KEY=your_google_api_key
- Place your Finance Bill PDF in the data directory as finance-bill-2025.pdf
- Run the application:
python main.py
API Endpoint
POST /query: Query the Finance Bill document
- Request body (top_k is optional and sets the number of chunks to retrieve):
{
  "query": "What changes are proposed for income tax?",
  "top_k": 4
}
- Response format (score is the relevance score of each source chunk):
{
  "query": "The original question asked",
  "answer": "Markdown-formatted response generated by Gemini",
  "sources": [
    {
      "content": "The text chunk from the document",
      "metadata": {
        "document_id": "finance-bill-2025",
        "chunk_index": 0,
        "chunk_count": 1
      },
      "score": 0.7167216539382935
    }
  ]
}
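To make the roles of top_k and score concrete, here is a toy retrieval sketch that ranks chunks against a query and keeps the best top_k. It is deliberately not the project's method: it swaps ChromaDB's embedding search for a plain bag-of-words cosine similarity so that it runs with the standard library alone. All function names are hypothetical, and it assumes that a higher score means a more relevant chunk.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list, top_k: int = 4) -> list:
    """Return the top_k chunks, highest score first, shaped like "sources"."""
    query_vec = bag_of_words(query)
    scored = [
        {
            "content": chunk,
            "metadata": {"chunk_index": index},
            "score": cosine_similarity(query_vec, bag_of_words(chunk)),
        }
        for index, chunk in enumerate(chunks)
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]
```

Raising top_k gives the LLM more context to draw on but also admits lower-scoring, less relevant chunks.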
Example Usage
curl -X POST "http://localhost:8000/query" \
-H "Content-Type: application/json" \
-d '{
"query": "What changes are proposed for income tax?"
}'
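The same request can be sent from Python. This sketch uses only the standard library; the URL and field names come from the curl example and request body above, while the helper names `build_query_request` and `ask` are hypothetical.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/query"  # same endpoint as the curl example


def build_query_request(query: str, top_k=None, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request; top_k is omitted unless explicitly given."""
    payload = {"query": query}
    if top_k is not None:
        payload["top_k"] = top_k
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, top_k=None) -> dict:
    """Send the query to a running server and return the decoded JSON response."""
    request = build_query_request(query, top_k)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `ask("What changes are proposed for income tax?")["answer"]` would return the Markdown-formatted answer.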
Interactive Documentation
The system includes Swagger UI documentation at http://localhost:8000/docs, where you can interactively test the API.