Exodus: Bilingual FAQ Chatbot with Hybrid Model Support
Exodus is a production-ready bilingual chatbot that combines Llama 3 (served via Ollama) with Qwen2.5-7B-Instruct-AWQ to handle both English and Arabic queries. The system takes a hybrid approach, pairing semantic search over the FAQ corpus with fine-tuned model responses.
Model Description
The system uses two main models:
Llama 3 (via Ollama):
- Used for simpler queries and general conversation
- Managed through Ollama for efficient resource usage
- Requires ~2GB system RAM
Qwen2.5-7B-Instruct-AWQ:
- Used for complex queries and specialized tasks
- 4-bit quantized for efficiency
- Requires ~8GB VRAM + ~4GB system RAM
- Fine-tuned on FAQ data using QLoRA
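Routing between the two models happens server-side. A minimal sketch of how such a hybrid router could look, assuming a sentence-transformers embedder; the embedding model, similarity threshold, and the "simple query" heuristic are illustrative assumptions, not the exact implementation:
```python
# Hybrid routing sketch: try a semantic FAQ lookup first, then pick a model.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

FAQ = {
    "What is your return policy?": "Our return policy allows...",
}
questions = list(FAQ.keys())
question_embeddings = embedder.encode(questions, convert_to_tensor=True)

def answer(query: str) -> str:
    """Serve a cached FAQ answer on a close match, else route to a model."""
    query_embedding = embedder.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, question_embeddings)[0]
    best = int(scores.argmax())
    if float(scores[best]) >= 0.80:  # high-confidence FAQ hit (threshold assumed)
        return FAQ[questions[best]]
    # Crude complexity heuristic: short queries go to the lighter model
    model = "llama3" if len(query.split()) < 15 else "qwen2.5-7b-awq"
    return f"[route to {model}]"  # placeholder: call the chosen backend here
```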
Intended Use
This model is designed for:
- Bilingual FAQ systems (English and Arabic)
- Customer support automation
- Documentation assistance
- General conversational tasks
Out-of-Scope Uses
This model should not be used for:
- Critical decision-making without human oversight
- Medical, legal, or financial advice
- Generation of harmful or inappropriate content
How to Use
1. Environment Setup
```bash
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```
2. Model Setup
Ollama Setup (for Llama 3):
```bash
# Install Ollama from https://ollama.ai
# Pull the Llama 3 model
ollama pull llama3:latest
```
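Once pulled, you can sanity-check the model through Ollama's local REST API (default port 11434):
```python
import requests

# Quick check that Ollama is serving llama3 locally
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Hello", "stream": False},
)
print(resp.json()["response"])
```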
Qwen Setup:
```bash
# The model will be downloaded automatically when first used
# Or manually download from HuggingFace:
huggingface-cli download Qwen/Qwen2.5-7B-Instruct-AWQ
```
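If you want to load the AWQ checkpoint yourself, e.g. for offline testing, a minimal sketch with transformers (requires the autoawq package; the generation settings are illustrative):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is your return policy?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```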
3. Running the System
```bash
# Start the server
python main.py

# In a separate terminal, start the frontend
cd frontend
npm install
npm run dev
```
4. API Usage
```python
import requests

# Switch between models
response = requests.post(
    "http://localhost:8000/api/models/switch",
    json={"model_id": "llama3"},
)

# Send a query
response = requests.post(
    "http://localhost:8000/api/chat",
    json={"message": "What is your return policy?", "language": "en"},
)
```
Fine-tuning
To fine-tune the model on your own FAQ data:
1. Prepare your data in JSONL format (a helper sketch follows this section):
```json
{"question": "What is your return policy?", "answer": "Our return policy allows..."}
```
2. Run the fine-tuning pipeline:
```bash
cd Fine-tune
./run.sh
```
For more detailed fine-tuning options:
```bash
./run.sh --help
```
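As a starting point for step 1, a minimal sketch that writes question-answer pairs to JSONL (the file name and the example pairs are assumptions):
```python
import json

# Hypothetical input: a list of (question, answer) pairs in either language
pairs = [
    ("What is your return policy?", "Our return policy allows..."),
    ("ما هي سياسة الإرجاع؟", "تسمح سياسة الإرجاع لدينا..."),
]

with open("faq_train.jsonl", "w", encoding="utf-8") as f:
    for question, answer in pairs:
        record = {"question": question.strip(), "answer": answer.strip()}
        # ensure_ascii=False keeps Arabic text readable in the output file
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```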
Performance and Limitations
Performance Metrics
- Response time:
  - Llama 3 (via Ollama): ~1-2 seconds
  - Qwen2.5: ~2-3 seconds
- Memory usage:
  - Llama 3: ~2GB RAM
  - Qwen2.5: ~8GB VRAM + ~4GB RAM
- Accuracy on FAQ queries: >90%
- Arabic language support accuracy: >85%
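Latency varies with hardware, so it is worth measuring against your own deployment. A quick round-trip timing sketch, using the same endpoint and payload as the API example above:
```python
import time
import requests

start = time.perf_counter()
requests.post(
    "http://localhost:8000/api/chat",
    json={"message": "What is your return policy?", "language": "en"},
)
# Round-trip time includes network overhead, not just model inference
print(f"Round-trip time: {time.perf_counter() - start:.2f}s")
```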
Limitations
Resource Requirements:
- Minimum 8GB VRAM for Qwen2.5
- Stable internet for initial model downloads
Language Support:
- Primary support for English and Arabic
- Other languages may work but are not optimized
Response Time:
- May be slower on CPU-only systems
- Complex queries take longer to process
Training Details
Training Data
The model can be fine-tuned on:
- FAQ datasets
- Customer support conversations
- Documentation Q&A pairs
Training Process
Data Preparation:
- Convert to JSONL format
- Clean and preprocess text
- Split into train/validation sets
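For the split step, a minimal sketch (the 90/10 ratio, seed, and file names are assumptions):
```python
import json
import random

with open("faq_train.jsonl", encoding="utf-8") as f:
    records = [json.loads(line) for line in f]

random.seed(42)
random.shuffle(records)
split = int(0.9 * len(records))  # assumed 90/10 train/validation split

for name, subset in [("train", records[:split]), ("valid", records[split:])]:
    with open(f"{name}.jsonl", "w", encoding="utf-8") as f:
        for record in subset:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```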
Fine-tuning:
- Uses QLoRA for efficient training
- 4-bit quantization
- Parameter-Efficient Fine-Tuning (PEFT)
Hyperparameters:
- Learning rate: 2e-4
- Batch size: 4
- LoRA rank: 8
- LoRA alpha: 16
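Putting the above together, a QLoRA configuration sketch using peft and transformers. Training runs on the unquantized base checkpoint (AWQ weights are not trainable with bitsandbytes); the target modules, epoch count, and NF4 settings are assumptions:
```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

# Load the base model in 4-bit (NF4 settings are assumptions)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA settings matching the hyperparameters listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    num_train_epochs=3,  # assumed
    output_dir="qlora-out",
)
```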
Ethical Considerations
Content Filtering:
- Built-in profanity filtering
- Content moderation for both English and Arabic
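The filter implementation itself is not shown in this card; a minimal sketch of what a configurable bilingual blocklist could look like (all names and terms are placeholders, and a real moderation layer needs more than a blocklist):
```python
# Illustrative only: placeholder terms, not the project's actual filter
BLOCKLIST = {
    "en": {"badword"},       # placeholder English terms
    "ar": {"كلمة_سيئة"},     # placeholder Arabic terms
}

def is_clean(text: str, language: str) -> bool:
    """Return False if the text contains a blocklisted term."""
    terms = BLOCKLIST.get(language, set())
    words = set(text.lower().split())
    return words.isdisjoint(terms)
```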
Bias Mitigation:
- Regular evaluation of responses
- Configurable content filters
Privacy:
- No user data retention
- Local model deployment option
Maintenance and Support
The project is actively maintained with:
- Regular model updates
- Security patches
- Bug fixes
- Community contributions
For support:
- Check the GitHub Issues
- Review the Documentation
- Join the community discussions
Citation
If you use this model in your research, please cite:
```bibtex
@software{exodus_chatbot,
  title   = {Exodus: Bilingual FAQ Chatbot with Hybrid Model Support},
  author  = {Yazeed Mshayekh},
  year    = {2024},
  url     = {https://github.com/yazeedmshayekh2/Exodus},
  version = {1.0.0}
}
```
License
This project is licensed under the MIT License. See the LICENSE file for details.