|
--- |
|
title: Multimodal AI Backend Service |
|
emoji: 🚀
|
colorFrom: yellow |
|
colorTo: purple |
|
sdk: docker |
|
app_port: 8000 |
|
pinned: false |
|
--- |
|
|
|
# firstAI - Multimodal AI Backend
|
|
|
A powerful AI backend service with **multimodal capabilities**, supporting both text generation and image analysis via Hugging Face Transformers pipelines.
|
|
|
## Features
|
|
|
### Dual AI Models
|
|
|
- **Text Generation**: Microsoft DialoGPT-medium for conversations |
|
- **Image Analysis**: Salesforce BLIP for image captioning and visual Q&A |
|
|
|
### Multimodal Support
|
|
|
- Process text-only messages |
|
- Analyze images from URLs |
|
- Combined image + text conversations |
|
- OpenAI Vision API compatible format |
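The OpenAI-Vision-compatible content parts can be assembled with a small helper; a minimal sketch (the function name `build_vision_message` is illustrative, and the part shapes mirror the request bodies in the usage examples below):

```python
def build_vision_message(image_url, text=None):
    """Assemble an OpenAI-Vision-style user message.

    The content-part shapes mirror the curl request bodies in this
    README; the helper name itself is illustrative.
    """
    content = [{"type": "image", "url": image_url}]
    if text:
        # Text parts are optional: image-only messages request a caption.
        content.append({"type": "text", "text": text})
    return {"role": "user", "content": content}

msg = build_vision_message("https://example.com/image.jpg",
                           "What do you see in this image?")
```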
|
|
|
### Production Ready
|
|
|
- FastAPI backend with automatic docs |
|
- Comprehensive error handling |
|
- Health checks and monitoring |
|
- PyTorch with MPS acceleration (Apple Silicon) |
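Device selection for the models typically runs in priority order: CUDA GPU, then Apple-Silicon MPS, then CPU. A minimal sketch of that logic, with the availability checks factored out as parameters (in real code they would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`):

```python
def select_device(cuda_available, mps_available):
    # Priority order: CUDA GPU first, then Apple-Silicon MPS, then CPU.
    # The boolean flags stand in for torch.cuda.is_available() and
    # torch.backends.mps.is_available() so this sketch runs without torch.
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

In the real service the returned string would be passed to the pipeline (or to `model.to(...)`) when loading the models.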
|
|
|
## Quick Start
|
|
|
### 1. Install Dependencies |
|
|
|
```bash |
|
pip install -r requirements.txt |
|
``` |
|
|
|
### 2. Start the Service |
|
|
|
```bash |
|
python backend_service.py |
|
``` |
|
|
|
### 3. Test Multimodal Capabilities |
|
|
|
```bash |
|
python test_final.py |
|
``` |
|
|
|
The service will start on **http://localhost:8001** with both text and vision models loaded. |
|
|
|
## Usage Examples
|
|
|
### Text-Only Chat |
|
|
|
```bash |
|
curl -X POST http://localhost:8001/v1/chat/completions \ |
|
-H "Content-Type: application/json" \ |
|
-d '{ |
|
"model": "microsoft/DialoGPT-medium", |
|
"messages": [{"role": "user", "content": "Hello!"}] |
|
}' |
|
``` |
|
|
|
### Image Analysis |
|
|
|
```bash |
|
curl -X POST http://localhost:8001/v1/chat/completions \ |
|
-H "Content-Type: application/json" \ |
|
-d '{ |
|
"model": "Salesforce/blip-image-captioning-base", |
|
"messages": [ |
|
{ |
|
"role": "user", |
|
"content": [ |
|
{ |
|
"type": "image", |
|
"url": "https://example.com/image.jpg" |
|
} |
|
] |
|
} |
|
] |
|
}' |
|
``` |
|
|
|
### Multimodal (Image + Text) |
|
|
|
```bash |
|
curl -X POST http://localhost:8001/v1/chat/completions \ |
|
-H "Content-Type: application/json" \ |
|
-d '{ |
|
"model": "Salesforce/blip-image-captioning-base", |
|
"messages": [ |
|
{ |
|
"role": "user", |
|
"content": [ |
|
{ |
|
"type": "image", |
|
"url": "https://example.com/image.jpg" |
|
}, |
|
{ |
|
"type": "text", |
|
"text": "What do you see in this image?" |
|
} |
|
] |
|
} |
|
] |
|
}' |
|
``` |
|
|
|
## Technical Details
|
|
|
### Architecture |
|
|
|
- **FastAPI** web framework |
|
- **Transformers** pipeline for AI models |
|
- **PyTorch** backend with GPU/MPS support |
|
- **Pydantic** for request/response validation |
|
|
|
### Models |
|
|
|
- **Text**: microsoft/DialoGPT-medium |
|
- **Vision**: Salesforce/blip-image-captioning-base |
|
|
|
### API Endpoints |
|
|
|
- `GET /` - Service information |
|
- `GET /health` - Health check |
|
- `GET /v1/models` - List available models |
|
- `POST /v1/chat/completions` - Chat completions (text/multimodal) |
|
- `GET /docs` - Interactive API documentation |
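Because `POST /v1/chat/completions` returns the OpenAI chat-completion shape, pulling the reply out of a response body is a one-liner; a sketch (the sample response dict below is illustrative, not captured from the service):

```python
def extract_reply(response_body):
    # OpenAI-style chat completions put the assistant's text at
    # choices[0].message.content.
    return response_body["choices"][0]["message"]["content"]

# Illustrative response body in the OpenAI chat-completion shape.
sample = {
    "choices": [
        {"index": 0,
         "message": {"role": "assistant", "content": "Hello there!"},
         "finish_reason": "stop"}
    ]
}
print(extract_reply(sample))  # Hello there!
```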
|
|
|
## Testing
|
|
|
Run the comprehensive test suite: |
|
|
|
```bash |
|
python test_final.py |
|
``` |
|
|
|
Test individual components: |
|
|
|
```bash |
|
python test_multimodal.py # Basic multimodal tests |
|
python test_pipeline.py # Pipeline compatibility |
|
``` |
|
|
|
## Dependencies
|
|
|
Key packages: |
|
|
|
- `fastapi` - Web framework |
|
- `transformers` - AI model pipelines |
|
- `torch` - PyTorch backend |
|
- `Pillow` - Image processing |
|
- `accelerate` - Model acceleration |
|
- `requests` - HTTP client |
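Assembled from the list above, a matching `requirements.txt` might look like the following (versions intentionally unpinned here; pin them for reproducible deployments):

```text
fastapi
transformers
torch
Pillow
accelerate
requests
```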
|
|
|
## Integration Complete
|
|
|
This project successfully integrates:

- ✅ **Transformers image-text-to-text pipeline**
- ✅ **OpenAI Vision API compatibility**
- ✅ **Multimodal message processing**
- ✅ **Production-ready FastAPI service**
|
|
|
See `MULTIMODAL_INTEGRATION_COMPLETE.md` for detailed integration documentation. |
|
|
|
|
--- |
|
|
|
# AI Backend Service
|
|
|
**Status: ✅ CONVERSION COMPLETE!**
|
|
|
Successfully converted from a non-functioning Gradio HuggingFace app to a production-ready FastAPI backend service with OpenAI-compatible API endpoints. |
|
|
|
## Quick Start |
|
|
|
### 1. Setup Environment |
|
|
|
```bash |
|
# Activate the virtual environment |
|
source gradio_env/bin/activate |
|
|
|
# Install dependencies (already done) |
|
pip install -r requirements.txt |
|
``` |
|
|
|
### 2. Start the Backend Service |
|
|
|
```bash |
|
python backend_service.py --port 8000 --reload |
|
``` |
|
|
|
### 3. Test the API |
|
|
|
```bash |
|
# Run comprehensive tests |
|
python test_api.py |
|
|
|
# Or try usage examples |
|
python usage_examples.py |
|
``` |
|
|
|
## API Endpoints |
|
|
|
| Endpoint | Method | Description | |
|
| ---------------------- | ------ | ----------------------------------- | |
|
| `/` | GET | Service information | |
|
| `/health` | GET | Health check | |
|
| `/v1/models` | GET | List available models | |
|
| `/v1/chat/completions` | POST | Chat completion (OpenAI compatible) | |
|
| `/v1/completions` | POST | Text completion | |
|
|
|
## Example Usage |
|
|
|
### Chat Completion |
|
|
|
```bash |
|
curl -X POST http://localhost:8000/v1/chat/completions \ |
|
-H "Content-Type: application/json" \ |
|
-d '{ |
|
"model": "microsoft/DialoGPT-medium", |
|
"messages": [ |
|
{"role": "user", "content": "Hello! How are you?"} |
|
], |
|
"max_tokens": 150, |
|
"temperature": 0.7 |
|
}' |
|
``` |
|
|
|
### Streaming Chat |
|
|
|
```bash |
|
curl -X POST http://localhost:8000/v1/chat/completions \ |
|
-H "Content-Type: application/json" \ |
|
-d '{ |
|
"model": "microsoft/DialoGPT-medium", |
|
"messages": [ |
|
{"role": "user", "content": "Tell me a joke"} |
|
], |
|
"stream": true |
|
}' |
|
``` |
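With `"stream": true`, an OpenAI-compatible endpoint emits Server-Sent-Events-style `data:` lines and ends the stream with `data: [DONE]`. A client-side parsing sketch (the chunk shape follows the OpenAI streaming format; the sample lines are illustrative):

```python
import json

def collect_stream(lines):
    # Reassemble the streamed reply from OpenAI-style SSE "data:" lines.
    # Each chunk carries a delta with a piece of the content; the stream
    # ends with the sentinel "data: [DONE]".
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives / blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Illustrative chunk lines, not captured from the service.
sample = [
    'data: {"choices": [{"delta": {"content": "Why did"}}]}',
    'data: {"choices": [{"delta": {"content": " the chicken..."}}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # Why did the chicken...
```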
|
|
|
## Files |
|
|
|
- **`app.py`** - Original Gradio ChatInterface (still functional) |
|
- **`backend_service.py`** - New FastAPI backend service ✅
|
- **`test_api.py`** - Comprehensive API testing |
|
- **`usage_examples.py`** - Simple usage examples |
|
- **`requirements.txt`** - Updated dependencies |
|
- **`CONVERSION_COMPLETE.md`** - Detailed conversion documentation |
|
|
|
## Features

- ✅ **OpenAI-Compatible API** - Drop-in replacement for the OpenAI API
- ✅ **Async FastAPI** - High-performance async architecture
- ✅ **Streaming Support** - Real-time response streaming
- ✅ **Error Handling** - Robust error handling with fallbacks
- ✅ **Production Ready** - CORS, logging, health checks
- ✅ **Docker Ready** - Easy containerization
- ✅ **Auto-reload** - Development-friendly auto-reload
- ✅ **Type Safety** - Full type hints with Pydantic validation
|
|
|
## Service URLs |
|
|
|
- **Backend Service**: http://localhost:8000 |
|
- **API Documentation**: http://localhost:8000/docs |
|
- **OpenAPI Spec**: http://localhost:8000/openapi.json |
|
|
|
## Model Information |
|
|
|
- **Current Model**: `microsoft/DialoGPT-medium` |
|
- **Type**: Conversational AI model |
|
- **Provider**: HuggingFace Inference API |
|
- **Capabilities**: Text generation, chat completion |
|
|
|
## Architecture |
|
|
|
```
┌─────────────────────┐      ┌──────────────────────┐      ┌─────────────────────┐
│   Client Request    │─────▶│   FastAPI Backend    │─────▶│   HuggingFace API   │
│   (OpenAI format)   │      │  (backend_service)   │      │  (DialoGPT-medium)  │
└─────────────────────┘      └──────────────────────┘      └─────────────────────┘
                                        │
                                        ▼
                             ┌──────────────────────┐
                             │   OpenAI Response    │
                             │   (JSON/Streaming)   │
                             └──────────────────────┘
```
|
|
|
## Development |
|
|
|
The service includes: |
|
|
|
- **Auto-reload** for development |
|
- **Comprehensive logging** for debugging |
|
- **Type checking** for code quality |
|
- **Test suite** for reliability |
|
- **Error handling** for robustness |
|
|
|
## Production Deployment |
|
|
|
Ready for production with: |
|
|
|
- **Environment variables** for configuration |
|
- **Health check endpoints** for monitoring |
|
- **CORS support** for web applications |
|
- **Docker compatibility** for containerization |
|
- **Structured logging** for observability |
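The environment-variable configuration mentioned above usually reduces to reads with sensible defaults; a minimal sketch (the names `BACKEND_HOST` and `BACKEND_PORT` are hypothetical, not necessarily what `backend_service.py` actually reads):

```python
import os

def load_config(environ=os.environ):
    # Read service configuration from the environment with fallbacks.
    # BACKEND_HOST / BACKEND_PORT are hypothetical names used for
    # illustration; substitute whatever the service actually reads.
    return {
        "host": environ.get("BACKEND_HOST", "0.0.0.0"),
        "port": int(environ.get("BACKEND_PORT", "8000")),
    }
```

Passing `environ` as a parameter keeps the function testable without mutating the real process environment.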
|
|
|
--- |
|
|
|
**Conversion Status: ✅ COMPLETE!**

Successfully transformed from a broken Gradio app to a production-ready AI backend service.
|
|
|
For detailed conversion documentation, see [`CONVERSION_COMPLETE.md`](CONVERSION_COMPLETE.md). |
|
|