
AI Chat Application for HuggingFace Spaces

A fully functional AI chat application for HuggingFace Spaces that integrates the Qwen3-Coder model behind an OpenAI-compatible API.

Features

  • Integration with the Qwen/Qwen3-Coder-30B-A3B-Instruct model
  • OpenAI-compatible API endpoints
  • Professional web interface inspired by the Perplexity AI design
  • Responsive layout styled with TailwindCSS
  • Dark/light mode support
  • Real-time streaming responses
  • Conversation history management
  • Copy-to-clipboard for responses
  • Typing indicators
  • GPU-accelerated inference
  • Robust error handling with automatic connection recovery
  • Optional Redis-backed conversation caching
  • Ready for immediate deployment on HuggingFace Spaces

Technology Stack

  • Backend: Python, Gradio, FastAPI, Transformers, PyTorch
  • Frontend: TailwindCSS, JavaScript, HTML5
  • Infrastructure: Redis for caching, HuggingFace Spaces deployment

Requirements

  • Python 3.8+
  • GPU with at least 24GB VRAM (for Qwen/Qwen3-Coder-30B-A3B-Instruct model)
  • Redis server (optional, for conversation caching)

Installation

  1. Clone this repository:

    git clone <repository-url>
    cd ai-chat-app
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Run the application:

    python app.py
    

Usage

Web Interface

The application provides a web interface accessible at http://localhost:7860 when running locally. The interface features:

  • Chat interface similar to Perplexity AI
  • Dark/light mode toggle
  • Conversation history sidebar
  • Copy buttons for responses
  • Typing indicators during response generation

API Endpoints

The application exposes OpenAI-compatible endpoints:

  • POST /v1/chat/completions - Chat completion endpoint

Example request:

{
  "messages": [
    {"role": "user", "content": "Hello, how are you?"}
  ],
  "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
  "max_tokens": 1024,
  "temperature": 0.7
}
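The request above can be sent with nothing but the Python standard library. This is a minimal client sketch, assuming the server runs locally on the default Gradio port and returns the usual OpenAI-compatible response shape (`choices[0].message.content`); adjust `BASE_URL` for your deployment.

```python
import json
from urllib import request

BASE_URL = "http://localhost:7860"  # assumed local deployment

def build_payload(prompt: str, max_tokens: int = 1024,
                  temperature: float = 0.7) -> dict:
    """Assemble a chat-completion request body matching the example above."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def chat(prompt: str) -> str:
    """POST the payload to /v1/chat/completions and return the reply text."""
    req = request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers place the text at choices[0].message.content
    return body["choices"][0]["message"]["content"]
```
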

Deployment to HuggingFace Spaces

  1. Create a new Space on HuggingFace with the following configuration:

    • SDK: Gradio
    • Hardware: GPU (recommended)

  2. Upload all files to your Space repository.

  3. The application will start automatically and be accessible through your Space URL.

Configuration

The application can be configured through environment variables:

  • MODEL_NAME: The HuggingFace model identifier (default: Qwen/Qwen3-Coder-30B-A3B-Instruct)
  • MAX_TOKENS: Default maximum tokens for responses (default: 1024)
  • TEMPERATURE: Default temperature for generation (default: 0.7)
  • REDIS_URL: Redis connection URL for caching (optional)
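A minimal sketch of how these variables might be read at startup; the variable names and defaults are exactly those documented above, but the `load_config` helper itself is illustrative, not part of this repository's code.

```python
import os

def load_config() -> dict:
    """Read the documented environment variables, falling back to defaults."""
    return {
        "model_name": os.environ.get(
            "MODEL_NAME", "Qwen/Qwen3-Coder-30B-A3B-Instruct"),
        "max_tokens": int(os.environ.get("MAX_TOKENS", "1024")),
        "temperature": float(os.environ.get("TEMPERATURE", "0.7")),
        # REDIS_URL is optional; None means caching is disabled.
        "redis_url": os.environ.get("REDIS_URL"),
    }
```
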

Troubleshooting

GPU Memory Issues

If you encounter GPU memory issues:

  1. Ensure your GPU has at least 24GB VRAM
  2. Try reducing the max_tokens parameter
  3. Use quantization techniques for model loading
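Step 3 can cut weight memory roughly in proportion to the bit width. The sketch below is an assumption about how one *might* load the model in 4-bit with bitsandbytes (requires `transformers`, `accelerate`, and `bitsandbytes`); it is not taken from this repository's `app.py`, and the helper is a back-of-the-envelope estimate only.

```python
def estimate_vram_gb(n_params_billion: float, bits: int) -> float:
    """Rough weight-memory estimate: parameters x bits / 8 bits-per-byte, in GB."""
    return n_params_billion * bits / 8

# A 30B-parameter model: ~60 GB of weights at 16-bit vs ~15 GB at 4-bit.

def load_quantized(model_name: str = "Qwen/Qwen3-Coder-30B-A3B-Instruct"):
    """Illustrative 4-bit load via bitsandbytes quantization."""
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    bnb_config = BitsAndBytesConfig(load_in_4bit=True)
    return AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto",  # spread layers across available devices
    )
```
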

Model Loading Errors

If the model fails to load:

  1. Check your internet connection
  2. Ensure you have sufficient disk space
  3. Verify the model identifier is correct

Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your changes.

License

This project is licensed under the MIT License - see the LICENSE file for details.