# AI Chat Application for HuggingFace Spaces

A fully functional AI chat application for HuggingFace Spaces, integrating the Qwen3 Coder model behind an OpenAI-compatible API.
## Features
- Integration with Qwen/Qwen3-Coder-30B-A3B-Instruct model
- OpenAI-compatible API endpoints
- Professional web interface replicating Perplexity AI design
- Responsive layout with TailwindCSS styling
- Dark/light mode support
- Real-time streaming responses
- Conversation history management
- Copy response functionality
- Typing indicators
- Full GPU optimization
- Robust error handling and automatic connection recovery
- Caching mechanisms
- Ready for immediate deployment on HuggingFace Spaces
## Technology Stack
- Backend: Python, Gradio, FastAPI, Transformers, PyTorch
- Frontend: TailwindCSS, JavaScript, HTML5
- Infrastructure: Redis for caching, HuggingFace Spaces deployment
## Requirements
- Python 3.8+
- GPU with at least 24GB VRAM (for Qwen/Qwen3-Coder-30B-A3B-Instruct model)
- Redis server (optional, for conversation caching)
## Installation

Clone this repository:

```bash
git clone <repository-url>
cd ai-chat-app
```

Install dependencies:

```bash
pip install -r requirements.txt
```

Run the application:

```bash
python app.py
```
## Usage

### Web Interface
The application provides a web interface accessible at http://localhost:7860 when running locally. The interface features:
- Chat interface similar to Perplexity AI
- Dark/light mode toggle
- Conversation history sidebar
- Copy buttons for responses
- Typing indicators during response generation
### API Endpoints

The application exposes OpenAI-compatible endpoints:

- `POST /v1/chat/completions` - Chat completion endpoint
Example request:

```json
{
  "messages": [
    {"role": "user", "content": "Hello, how are you?"}
  ],
  "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
  "max_tokens": 1024,
  "temperature": 0.7
}
```
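The same request can be issued from Python. The snippet below builds the payload shown above; the base URL and port follow the local setup described under Usage, and the commented-out `requests` call is illustrative rather than a required client:

```python
import json

# OpenAI-compatible chat completion payload (matches the example above).
payload = {
    "messages": [
        {"role": "user", "content": "Hello, how are you?"}
    ],
    "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "max_tokens": 1024,
    "temperature": 0.7,
}

# With the app running locally (see Usage), send it with the requests library:
#   import requests
#   resp = requests.post("http://localhost:7860/v1/chat/completions", json=payload)
#   print(resp.json()["choices"][0]["message"]["content"])

print(json.dumps(payload, indent=2))
```

Because the endpoint follows the OpenAI schema, pointing the official `openai` Python client at `base_url="http://localhost:7860/v1"` should also work.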
## Deployment to HuggingFace Spaces

1. Create a new Space on HuggingFace with the following configuration:
   - SDK: Gradio
   - Hardware: GPU (recommended)
2. Upload all files to your Space repository.
3. The application will automatically start and be accessible through your Space URL.
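HuggingFace Spaces reads this configuration from a YAML front-matter block at the top of the Space's `README.md`. A minimal sketch; the title, emoji, and `sdk_version` are placeholders to adjust for your Space:

```yaml
---
title: AI Chat Application
emoji: 💬
sdk: gradio
sdk_version: 4.44.0  # pin to the Gradio version in requirements.txt
app_file: app.py
---
```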
## Configuration

The application can be configured through environment variables:

- `MODEL_NAME`: The HuggingFace model identifier (default: `Qwen/Qwen3-Coder-30B-A3B-Instruct`)
- `MAX_TOKENS`: Default maximum tokens for responses (default: 1024)
- `TEMPERATURE`: Default temperature for generation (default: 0.7)
- `REDIS_URL`: Redis connection URL for caching (optional)
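A minimal sketch of reading these variables with the documented defaults; the variable names come from this README, and the actual `app.py` may structure this differently:

```python
import os

# Configuration with the defaults documented above; unset variables fall back.
MODEL_NAME = os.environ.get("MODEL_NAME", "Qwen/Qwen3-Coder-30B-A3B-Instruct")
MAX_TOKENS = int(os.environ.get("MAX_TOKENS", "1024"))
TEMPERATURE = float(os.environ.get("TEMPERATURE", "0.7"))
REDIS_URL = os.environ.get("REDIS_URL")  # None disables conversation caching
```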
## Troubleshooting

### GPU Memory Issues

If you encounter GPU memory issues:

- Ensure your GPU has at least 24GB VRAM
- Try reducing the `max_tokens` parameter
- Use quantization techniques for model loading
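As a sketch of the quantization option, the model could be loaded in 4-bit with `transformers` and `bitsandbytes` (both must be installed). The exact settings are illustrative, not the app's actual loading code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

def load_quantized_model(model_name: str = "Qwen/Qwen3-Coder-30B-A3B-Instruct"):
    """Load the model in 4-bit NF4, roughly quartering its VRAM footprint."""
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=quant_config,
        device_map="auto",  # spread layers across available GPUs
    )
    return tokenizer, model
```

Note that 4-bit loading trades some generation quality for memory, so test outputs before relying on it in production.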
### Model Loading Errors
If the model fails to load:
- Check your internet connection
- Ensure you have sufficient disk space
- Verify the model identifier is correct
## Contributing
Contributions are welcome! Please fork the repository and submit a pull request with your changes.
## License
This project is licensed under the MIT License - see the LICENSE file for details.