---
title: Multimodal AI Backend Service
emoji: 🚀
colorFrom: yellow
colorTo: purple
sdk: docker
app_port: 8000
pinned: false
---
# firstAI - Multimodal AI Backend 🚀
An AI backend service with **multimodal capabilities**, supporting both text generation and image analysis through Transformers pipelines.
## 🎉 Features
### 🤖 Dual AI Models
- **Text Generation**: Microsoft DialoGPT-medium for conversations
- **Image Analysis**: Salesforce BLIP for image captioning and visual Q&A
### 🖼️ Multimodal Support
- Process text-only messages
- Analyze images from URLs
- Combined image + text conversations
- OpenAI Vision API compatible format
### 🔧 Production Ready
- FastAPI backend with automatic docs
- Comprehensive error handling
- Health checks and monitoring
- PyTorch with MPS acceleration (Apple Silicon)
## 🚀 Quick Start
### 1. Install Dependencies
```bash
pip install -r requirements.txt
```
### 2. Start the Service
```bash
python backend_service.py
```
### 3. Test Multimodal Capabilities
```bash
python test_final.py
```
The service will start on **http://localhost:8001** with both text and vision models loaded.
## 💡 Usage Examples
### Text-Only Chat
```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
### Image Analysis
```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Salesforce/blip-image-captioning-base",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "url": "https://example.com/image.jpg"
          }
        ]
      }
    ]
  }'
```
### Multimodal (Image + Text)
```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Salesforce/blip-image-captioning-base",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "url": "https://example.com/image.jpg"
          },
          {
            "type": "text",
            "text": "What do you see in this image?"
          }
        ]
      }
    ]
  }'
```
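The three curl calls above all target the same endpoint and differ only in the request body. From Python, the bodies can be assembled with a small helper; this is an illustrative sketch (the helper name is ours, not part of the service), and it only builds the JSON payload, which you would then POST to `/v1/chat/completions` with any HTTP client such as `requests`.

```python
# Hypothetical helper (not part of backend_service.py): assemble the
# OpenAI Vision-style payloads shown in the curl examples above.

def build_multimodal_request(model, image_url=None, text=None):
    """Build a /v1/chat/completions body for text-only, image-only,
    or combined image + text requests."""
    parts = []
    if image_url:
        parts.append({"type": "image", "url": image_url})
    if text:
        parts.append({"type": "text", "text": text})
    # The text-only example uses a plain string as content; mirror that.
    content = text if (text and not image_url) else parts
    return {"model": model, "messages": [{"role": "user", "content": content}]}
```

Posting the result, e.g. `requests.post(url, json=build_multimodal_request(...))`, reproduces the multimodal curl call above.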
## 🔧 Technical Details
### Architecture
- **FastAPI** web framework
- **Transformers** pipeline for AI models
- **PyTorch** backend with GPU/MPS support
- **Pydantic** for request/response validation
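As a sketch of what the Pydantic layer validates, the models below are inferred from the JSON bodies in the usage examples; the actual definitions live in `backend_service.py` and may differ.

```python
# Inferred request models (illustrative; the real ones are in backend_service.py).
from typing import List, Union

from pydantic import BaseModel


class ChatMessage(BaseModel):
    role: str
    # Either a plain string, or a list of typed parts such as
    # {"type": "image", "url": ...} and {"type": "text", "text": ...}.
    content: Union[str, List[dict]]


class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[ChatMessage]
```

FastAPI uses models like these to reject malformed bodies with a 422 before any model inference runs.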
### Models
- **Text**: microsoft/DialoGPT-medium
- **Vision**: Salesforce/blip-image-captioning-base
### API Endpoints
- `GET /` - Service information
- `GET /health` - Health check
- `GET /v1/models` - List available models
- `POST /v1/chat/completions` - Chat completions (text/multimodal)
- `GET /docs` - Interactive API documentation
## 🧪 Testing
Run the comprehensive test suite:
```bash
python test_final.py
```
Test individual components:
```bash
python test_multimodal.py # Basic multimodal tests
python test_pipeline.py # Pipeline compatibility
```
## 📦 Dependencies
Key packages:
- `fastapi` - Web framework
- `transformers` - AI model pipelines
- `torch` - PyTorch backend
- `Pillow` - Image processing
- `accelerate` - Model acceleration
- `requests` - HTTP client
## 🎯 Integration Complete
This project successfully integrates:
✅ **Transformers image-text-to-text pipeline**
✅ **OpenAI Vision API compatibility**
✅ **Multimodal message processing**
✅ **Production-ready FastAPI service**
See `MULTIMODAL_INTEGRATION_COMPLETE.md` for detailed integration documentation.
# AI Backend Service 🚀
**Status: ✅ CONVERSION COMPLETE!**
Successfully converted from a non-functioning Gradio HuggingFace app to a production-ready FastAPI backend service with OpenAI-compatible API endpoints.
## Quick Start
### 1. Setup Environment
```bash
# Activate the virtual environment
source gradio_env/bin/activate
# Install dependencies (skip if already installed)
pip install -r requirements.txt
```
### 2. Start the Backend Service
```bash
python backend_service.py --port 8000 --reload
```
### 3. Test the API
```bash
# Run comprehensive tests
python test_api.py
# Or try usage examples
python usage_examples.py
```
## API Endpoints
| Endpoint | Method | Description |
| ---------------------- | ------ | ----------------------------------- |
| `/` | GET | Service information |
| `/health` | GET | Health check |
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completion (OpenAI compatible) |
| `/v1/completions` | POST | Text completion |
## Example Usage
### Chat Completion
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7
  }'
```
### Streaming Chat
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [
      {"role": "user", "content": "Tell me a joke"}
    ],
    "stream": true
  }'
```
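With `stream: true`, an OpenAI-compatible endpoint conventionally replies with server-sent events: `data: {json}` lines terminated by a `data: [DONE]` sentinel. Assuming this service follows that framing (the streaming wire format is not spelled out here), a minimal client-side parser looks like:

```python
# Sketch of a client-side parser for OpenAI-style streaming responses.
# Assumes `data: {json}` SSE lines ending with `data: [DONE]`.
import json


def parse_sse_chunks(lines):
    """Yield decoded JSON chunks from OpenAI-style SSE lines,
    stopping at the `data: [DONE]` sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        yield json.loads(payload)
```

Feeding it `response.iter_lines(decode_unicode=True)` from a `requests.post(..., stream=True)` call would yield the incremental `choices[0].delta` pieces as they arrive.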
## Files
- **`app.py`** - Original Gradio ChatInterface (still functional)
- **`backend_service.py`** - New FastAPI backend service ⭐
- **`test_api.py`** - Comprehensive API testing
- **`usage_examples.py`** - Simple usage examples
- **`requirements.txt`** - Updated dependencies
- **`CONVERSION_COMPLETE.md`** - Detailed conversion documentation
## Features
✅ **OpenAI-Compatible API** - Drop-in replacement for OpenAI API
✅ **Async FastAPI** - High-performance async architecture
✅ **Streaming Support** - Real-time response streaming
✅ **Error Handling** - Robust error handling with fallbacks
✅ **Production Ready** - CORS, logging, health checks
✅ **Docker Ready** - Easy containerization
✅ **Auto-reload** - Development-friendly auto-reload
✅ **Type Safety** - Full type hints with Pydantic validation
## Service URLs
- **Backend Service**: http://localhost:8000
- **API Documentation**: http://localhost:8000/docs
- **OpenAPI Spec**: http://localhost:8000/openapi.json
## Model Information
- **Current Model**: `microsoft/DialoGPT-medium`
- **Type**: Conversational AI model
- **Provider**: HuggingFace Inference API
- **Capabilities**: Text generation, chat completion
## Architecture
```
┌─────────────────────┐      ┌──────────────────────┐      ┌─────────────────────┐
│   Client Request    │─────▶│   FastAPI Backend    │─────▶│   HuggingFace API   │
│   (OpenAI format)   │      │  (backend_service)   │      │  (DialoGPT-medium)  │
└─────────────────────┘      └──────────────────────┘      └─────────────────────┘
                                        │
                                        ▼
                             ┌──────────────────────┐
                             │   OpenAI Response    │
                             │   (JSON/Streaming)   │
                             └──────────────────────┘
```
## Development
The service includes:
- **Auto-reload** for development
- **Comprehensive logging** for debugging
- **Type checking** for code quality
- **Test suite** for reliability
- **Error handling** for robustness
## Production Deployment
Ready for production with:
- **Environment variables** for configuration
- **Health check endpoints** for monitoring
- **CORS support** for web applications
- **Docker compatibility** for containerization
- **Structured logging** for observability
---
**🎉 Conversion Status: COMPLETE!**
Successfully transformed from a broken Gradio app into a production-ready AI backend service.
For detailed conversion documentation, see [`CONVERSION_COMPLETE.md`](CONVERSION_COMPLETE.md).