AI Chat Application with Qwen Coder
This is a fully functional AI chat application built for HuggingFace Spaces. It integrates the Qwen/Qwen3-Coder-30B-A3B-Instruct model and exposes OpenAI API-compatible endpoints.
Features
- Qwen Coder 3 Integration: Direct integration with the Qwen/Qwen3-Coder-30B-A3B-Instruct model
- OpenAI API Compatibility: Implements an OpenAI API-compatible chat completions endpoint (POST /v1/chat/completions)
- Streaming Responses: Real-time response streaming for interactive chat experience
- Conversation History: Persistent conversation history management
- Modern UI: Responsive design inspired by Perplexity AI with TailwindCSS
- Dark/Light Mode: Support for both dark and light themes
- Copy Responses: One-click copying of AI responses
- Typing Indicators: Visual indicators for AI response generation
- GPU Optimization: Automatically detects and uses CUDA when a GPU is available
- Error Handling: Robust error handling with automatic connection recovery
- Caching: Redis-backed caching when available, with in-memory storage as a fallback
Project Structure
/
├── app.py              # Main application entry point
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── public/             # Frontend static files
│   ├── index.html      # Main HTML file
│   ├── styles.css      # TailwindCSS styles
│   └── app.js          # JavaScript logic
└── utils/              # Utility modules
    ├── model_utils.py  # Model management utilities
    ├── conversation.py # Conversation management
    └── api_compat.py   # OpenAI API compatibility
Requirements
- Python 3.8+
- GPU with CUDA support (recommended)
- 32GB+ RAM (for optimal performance with Qwen Coder 3)
Installation
Clone this repository:
git clone <repository-url>
cd <repository-name>
Install dependencies:
pip install -r requirements.txt
Run the application:
python app.py
Deployment to HuggingFace Spaces
Create a new Space on HuggingFace:
- Go to https://huggingface.co/new-space
- Choose "Gradio" as the Space SDK
- Select GPU hardware (recommended for Qwen Coder 3)
Upload files to your Space repository:
- Upload all files from this repository
- Make sure to include the requirements.txt file
Configure the Space:
- The Space will automatically detect and install dependencies from requirements.txt
- The application will start automatically on port 7860
Access your deployed application:
- Once the build is complete, your application will be available at the provided URL
API Endpoints
OpenAI API-Compatible Endpoint
POST /v1/chat/completions
Request format:
{
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
],
"model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
"max_tokens": 1024,
"temperature": 0.7
}
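As a rough illustration, the request above could be built and sent from Python using only the standard library. This is a sketch, not the project's client: the helper names `build_chat_payload` and `post_chat_completion` are hypothetical, the base URL `http://localhost:7860` is an assumption based on the port mentioned under deployment, and the response is assumed to be a JSON document.

```python
import json
import urllib.request

MODEL = "Qwen/Qwen3-Coder-30B-A3B-Instruct"


def build_chat_payload(user_text, system_text="You are a helpful assistant.",
                       max_tokens=1024, temperature=0.7):
    """Build a request body in the format shown above (hypothetical helper)."""
    return {
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
        "model": MODEL,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def post_chat_completion(payload, base_url="http://localhost:7860", timeout=120):
    """POST the payload to /v1/chat/completions and return the decoded JSON.

    base_url is an assumption; replace it with the URL of your Space.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the request body; call post_chat_completion(payload)
    # once the application is running.
    payload = build_chat_payload("Hello!")
    print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the network call makes the body easy to inspect or log before anything is sent.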
Frontend Chat Endpoint
POST /chat
Request format:
{
"message": "Hello!",
"history": [
{"role": "user", "content": "Previous message"},
{"role": "assistant", "content": "Previous response"}
]
}
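One plausible way to relate the two request formats is to fold a /chat body into the `messages` list used by /v1/chat/completions. The sketch below assumes the new `message` is appended as the final user turn after `history`; that behavior is not confirmed by this README, and `chat_request_to_messages` is a hypothetical name.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def chat_request_to_messages(body):
    """Convert a /chat request body into an OpenAI-style messages list.

    Assumption: the new message becomes the last user turn after the history.
    """
    messages = []
    for turn in body.get("history") or []:
        role = turn.get("role")
        content = turn.get("content")
        if role not in ALLOWED_ROLES or not isinstance(content, str):
            raise ValueError(f"invalid history entry: {turn!r}")
        messages.append({"role": role, "content": content})

    message = body.get("message")
    if not isinstance(message, str) or not message.strip():
        raise ValueError("'message' must be a non-empty string")
    messages.append({"role": "user", "content": message})
    return messages


if __name__ == "__main__":
    example = {
        "message": "Hello!",
        "history": [
            {"role": "user", "content": "Previous message"},
            {"role": "assistant", "content": "Previous response"},
        ],
    }
    for entry in chat_request_to_messages(example):
        print(entry)
```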
Customization
Model Configuration
You can customize the model behavior by modifying the parameters in utils/model_utils.py:
- DEFAULT_MAX_TOKENS: Maximum tokens to generate
- DEFAULT_TEMPERATURE: Sampling temperature
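To make the role of these two parameters concrete, here is a minimal sketch of how defaults like these might be combined with per-request overrides. The values 1024 and 0.7 are illustrative (borrowed from the request example in this README), not the confirmed defaults in utils/model_utils.py, and `resolve_generation_params` is a hypothetical helper.

```python
# Illustrative values only; check utils/model_utils.py for the real defaults.
DEFAULT_MAX_TOKENS = 1024
DEFAULT_TEMPERATURE = 0.7


def resolve_generation_params(request_body):
    """Return generation settings, falling back to the module defaults."""
    max_tokens = request_body.get("max_tokens")
    if max_tokens is None:
        max_tokens = DEFAULT_MAX_TOKENS

    temperature = request_body.get("temperature")
    if temperature is None:
        temperature = DEFAULT_TEMPERATURE

    max_tokens = int(max_tokens)
    temperature = float(temperature)
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    return {"max_tokens": max_tokens, "temperature": temperature}


if __name__ == "__main__":
    print(resolve_generation_params({}))
    print(resolve_generation_params({"max_tokens": 256, "temperature": 0}))
```

The explicit `is None` checks matter here: a request that sets `temperature` to 0 should be honored rather than silently replaced by the default.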
UI Customization
The UI can be customized by modifying:
- public/styles.css: CSS styles with TailwindCSS
- public/app.js: JavaScript logic
- public/index.html: HTML structure
Troubleshooting
Common Issues
Model Loading Errors:
- Ensure you have sufficient RAM and GPU memory
- Check that the model name is correct in utils/model_utils.py
CUDA Out of Memory:
- Reduce DEFAULT_MAX_TOKENS in utils/model_utils.py
- Use a smaller model variant if available
Dependency Installation Failures:
- Check the HuggingFace Space logs for specific error messages
- Ensure all dependencies are listed in requirements.txt
Performance Optimization
GPU Usage:
- The application automatically detects and uses CUDA if available
- For CPU-only environments, performance will be significantly slower
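Device detection of this kind is commonly written as a small helper. The sketch below assumes the application uses PyTorch, which this README does not state explicitly; `pick_device` is a hypothetical name, and the import is guarded so the function still works where PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu".

    Assumption: PyTorch is the backend. If it is not installed,
    the function falls back to "cpu".
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device()}")
```

The returned string can then be passed to whatever model-loading code the application uses.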
Caching:
- Redis is used for caching if available
- In-memory storage is used as fallback
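The fallback described above could look roughly like the following. This is a sketch under assumptions, not the project's caching code: the class and function names are hypothetical, the Redis URL is supplied by the caller, and the `redis` package is treated as optional.

```python
import time


class InMemoryCache:
    """Dictionary-backed cache with optional per-key expiry (in seconds)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # drop expired entries lazily
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else time.monotonic() + ttl
        self._store[key] = (value, expires_at)


class RedisCache:
    """Thin wrapper exposing the same get/set interface over a Redis client."""

    def __init__(self, client):
        self._client = client

    def get(self, key):
        return self._client.get(key)

    def set(self, key, value, ttl=None):
        self._client.set(key, value, ex=ttl)


def make_cache(redis_url=None):
    """Return a Redis-backed cache if reachable, else an in-memory one."""
    if redis_url:
        try:
            import redis  # optional dependency

            client = redis.Redis.from_url(redis_url, decode_responses=True)
            client.ping()  # fail fast if the server is unreachable
            return RedisCache(client)
        except Exception:
            pass  # missing package or connection failure: use the fallback
    return InMemoryCache()


if __name__ == "__main__":
    cache = make_cache()  # no URL given, so this uses in-memory storage
    cache.set("greeting", "hello", ttl=60)
    print(cache.get("greeting"))
```

Because both classes share the same `get`/`set` interface, the rest of the application does not need to know which backend is active.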
Contributing
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Qwen team for the Qwen/Qwen3-Coder-30B-A3B-Instruct model
- HuggingFace for providing the platform
- Gradio team for the web interface framework