# Model Configuration Guide
The backend now supports configurable models via environment variables, making it easy to switch between different AI models without code changes.
## Environment Variables

### Primary Configuration
```bash
# Main AI model for text generation (required)
export AI_MODEL="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"

# Vision model for image processing (optional)
export VISION_MODEL="Salesforce/blip-image-captioning-base"

# HuggingFace token for private models (optional)
export HF_TOKEN="your_huggingface_token_here"
```
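
For orientation, a backend reading these variables would typically resolve them at startup along these lines (a minimal sketch with defaults taken from this guide; the actual logic in backend_service.py may differ):

```python
import os

# Defaults mirror the values documented above; adjust to match backend_service.py.
AI_MODEL = os.getenv("AI_MODEL", "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")
VISION_MODEL = os.getenv("VISION_MODEL", "Salesforce/blip-image-captioning-base")
HF_TOKEN = os.getenv("HF_TOKEN")  # None when unset; only needed for private models

print(f"Loading text model: {AI_MODEL}")
print(f"Loading vision model: {VISION_MODEL}")
```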
## Usage Examples

### 1. Use DeepSeek-R1 (Default)
```bash
# Uses your originally requested model
export AI_MODEL="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"
./gradio_env/bin/python backend_service.py
```
### 2. Use DialoGPT (Faster, smaller)
```bash
# Switch to a lighter model for development/testing
export AI_MODEL="microsoft/DialoGPT-medium"
./gradio_env/bin/python backend_service.py
```
### 3. Use Unsloth 4-bit Quantized Models
```bash
# Use the Unsloth 4-bit Mistral model (memory efficient)
export AI_MODEL="unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit"
./gradio_env/bin/python backend_service.py

# Use other Unsloth models
export AI_MODEL="unsloth/llama-3-8b-Instruct-bnb-4bit"
./gradio_env/bin/python backend_service.py
```
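
The `*-bnb-4bit` checkpoints embed a bitsandbytes quantization config, so a plain `from_pretrained` call loads them already quantized. A minimal sketch, assuming `transformers`, `bitsandbytes`, and `accelerate` are installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" (via accelerate) places the quantized weights on GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```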
### 4. Use Other Popular Models
```bash
# Use Zephyr chat model
export AI_MODEL="HuggingFaceH4/zephyr-7b-beta"
./gradio_env/bin/python backend_service.py

# Use CodeLlama for code generation
export AI_MODEL="codellama/CodeLlama-7b-Instruct-hf"
./gradio_env/bin/python backend_service.py

# Use Mistral
export AI_MODEL="mistralai/Mistral-7B-Instruct-v0.2"
./gradio_env/bin/python backend_service.py
```
### 5. Use a Different Vision Model
```bash
export AI_MODEL="microsoft/DialoGPT-medium"
export VISION_MODEL="nlpconnect/vit-gpt2-image-captioning"
./gradio_env/bin/python backend_service.py
```
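
Both captioning checkpoints work with the `transformers` image-to-text pipeline, which is why swapping `VISION_MODEL` requires no code changes. A quick sketch (`photo.jpg` is a placeholder path):

```python
from transformers import pipeline

# Works for both Salesforce/blip-image-captioning-base and
# nlpconnect/vit-gpt2-image-captioning.
captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
print(captioner("photo.jpg")[0]["generated_text"])
```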
## Startup Script Examples

### Development Mode (Fast startup)
```bash
#!/bin/bash
# dev_mode.sh
export AI_MODEL="microsoft/DialoGPT-medium"
export VISION_MODEL="Salesforce/blip-image-captioning-base"
./gradio_env/bin/python backend_service.py
```
### Production Mode (Your preferred model)
```bash
#!/bin/bash
# production_mode.sh
export AI_MODEL="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"
export VISION_MODEL="Salesforce/blip-image-captioning-base"
export HF_TOKEN="$YOUR_HF_TOKEN"
./gradio_env/bin/python backend_service.py
```
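
The token only matters when the configured model is private or gated. Assuming a `transformers`-based loader, passing it through would look roughly like this (a sketch, not the backend's actual code):

```python
import os
from transformers import AutoModelForCausalLM

# token is ignored for public models, so passing None is safe.
model = AutoModelForCausalLM.from_pretrained(
    os.getenv("AI_MODEL", "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"),
    token=os.getenv("HF_TOKEN"),
)
```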
### Testing Mode (Lightweight)
```bash
#!/bin/bash
# test_mode.sh
export AI_MODEL="microsoft/DialoGPT-medium"
export VISION_MODEL="Salesforce/blip-image-captioning-base"
./gradio_env/bin/python backend_service.py
```
## Model Verification
After starting the backend, check which model is loaded:
```bash
curl http://localhost:8000/health
```

The response will show the loaded model:
```json
{
  "status": "healthy",
  "model": "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
  "version": "1.0.0"
}
```
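
If you prefer to verify from a script, here is a Python equivalent of the `curl` check (it assumes the `requests` package is installed and the response fields shown above):

```python
import requests

resp = requests.get("http://localhost:8000/health", timeout=5)
resp.raise_for_status()
print(f"Loaded model: {resp.json()['model']}")
```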
## Model Comparison

| Model | Size | Speed | Quality | Use Case |
|---|---|---|---|---|
| `microsoft/DialoGPT-medium` | ~355MB | Fast | Good | Development/Testing |
| `deepseek-ai/DeepSeek-R1-0528-Qwen3-8B` | ~16GB | Slow | Excellent | Production |
| `unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit` | ~7GB | Medium | Excellent | Production (4-bit) |
| `HuggingFaceH4/zephyr-7b-beta` | ~14GB | Slow | Excellent | Chat/Conversation |
| `codellama/CodeLlama-7b-Instruct-hf` | ~13GB | Slow | Good | Code Generation |
## Troubleshooting

### Model Not Found
```bash
# Verify the model exists on HuggingFace
./gradio_env/bin/python -c "
from huggingface_hub import HfApi
api = HfApi()
try:
    info = api.model_info('your-model-name')
    print(f'Model exists: {info.id}')
except Exception:
    print('Model not found')
"
```
### Memory Issues
```bash
# Use a smaller model for limited RAM
export AI_MODEL="microsoft/DialoGPT-medium"  # ~355MB
# or
export AI_MODEL="distilgpt2"  # ~82MB
```
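
To gauge a model's footprint before downloading it, you can sum the file sizes the Hub reports. Note this measures repository download size, which only approximates RAM usage:

```python
from huggingface_hub import HfApi

info = HfApi().model_info("microsoft/DialoGPT-medium", files_metadata=True)
total_bytes = sum(s.size or 0 for s in info.siblings)
print(f"Download size: {total_bytes / 1e9:.2f} GB")
```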
### Authentication Issues

```bash
# Set HuggingFace token for private models
export HF_TOKEN="hf_your_token_here"
```
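
To confirm a token is valid before starting the backend, the `huggingface_hub.whoami` helper raises on a bad token (the token string below is a placeholder):

```python
from huggingface_hub import whoami

# Prints the account name on success; raises an HTTP error for an invalid token.
print(whoami(token="hf_your_token_here")["name"])
```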
## Quick Switch Commands
```bash
# Quick switch to development mode
export AI_MODEL="microsoft/DialoGPT-medium" && ./gradio_env/bin/python backend_service.py

# Quick switch to production mode
export AI_MODEL="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B" && ./gradio_env/bin/python backend_service.py

# Quick switch with a custom vision model
export AI_MODEL="microsoft/DialoGPT-medium" VISION_MODEL="nlpconnect/vit-gpt2-image-captioning" && ./gradio_env/bin/python backend_service.py
```
## Summary

- **Environment variable:** `AI_MODEL` controls the main text-generation model
- **Default:** `deepseek-ai/DeepSeek-R1-0528-Qwen3-8B` (your original preference)
- **Alternative:** `microsoft/DialoGPT-medium` (faster for development)
- **Vision model:** `VISION_MODEL` controls the image-processing model
- **No code changes:** switch models by changing environment variables only

Your original DeepSeek-R1 model is still the default - I simply made it configurable so you can easily switch when needed!