# Backend Documentation
This document provides an overview of the backend architecture and components for the AI Maker Space project.
## Project Structure
```
backend/
├── agents/      # Agent definitions and implementations
├── api/         # API endpoints and routes
├── core/        # Core functionality and utilities
├── prompts/     # Prompt templates and management
├── tools/       # Tool implementations for agents
├── tests/       # Test suite
├── main.py      # Application entry point
└── .env         # Environment variables (not in repo)
```
## Core Components
### Vector Store
The `VectorStore` class in `core/vector_store.py` implements a singleton pattern for managing document embeddings and vector search functionality. It provides methods for:
- Processing and embedding documents
- Searching for relevant context based on queries
- Managing the vector database state
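For illustration, a minimal sketch of the singleton pattern is shown below; the method names are assumptions and may not match the actual `core/vector_store.py` API.
```
# Minimal singleton sketch; method names are illustrative, not the real
# core/vector_store.py API.
from typing import List, Optional


class VectorStore:
    _instance: Optional["VectorStore"] = None

    def __new__(cls) -> "VectorStore":
        # Every caller gets the same instance within the process.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._documents = []
        return cls._instance

    def add_documents(self, chunks: List[str]) -> None:
        # Embedding of chunks is omitted here for brevity.
        self._documents.extend(chunks)

    def search(self, query: str, k: int = 3) -> List[str]:
        # Placeholder relevance logic; the real store ranks by vector similarity.
        return self._documents[:k]
```
Because `__new__` always returns the same object, calling `VectorStore()` from different modules operates on the same in-memory index.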
### Text Processing
The `text_utils.py` module provides utilities for:
- Loading documents from various formats (PDF, TXT)
- Splitting text into chunks for processing
- Text preprocessing and normalization
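As a rough sketch of the chunking step (the chunk size, overlap, and splitting strategy actually used in `text_utils.py` are not specified here and may differ):
```
from typing import List


def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    # Fixed-size chunks with overlap so neighbouring chunks share context.
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```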
### Embeddings
The `embeddings.py` module handles:
- Creating embeddings for text chunks
- Managing embedding models and configurations
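Creating embeddings with the OpenAI Python client generally follows the pattern below; the model name and the exact interface used by `embeddings.py` are assumptions.
```
from typing import List

from openai import OpenAI  # reads OPENAI_API_KEY from the environment

client = OpenAI()


def embed_texts(texts: List[str], model: str = "text-embedding-3-small") -> List[List[float]]:
    # One embedding vector is returned per input string.
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]
```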
### Vector Database
The `vectordatabase.py` module implements:
- Storage and retrieval of vector embeddings
- Similarity search functionality
- Database management operations
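At its core, similarity search ranks stored vectors by cosine similarity to the query embedding; a minimal in-memory version (not necessarily how `vectordatabase.py` stores its data) looks like this:
```
from typing import List, Tuple

import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def top_k(query_vec: np.ndarray, stored: List[Tuple[str, np.ndarray]], k: int = 3) -> List[Tuple[str, float]]:
    # Score every stored chunk against the query and keep the k best matches.
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in stored]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```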
## API Endpoints
The backend exposes the following API endpoints:
### Upload Endpoints
- `POST /upload/pdf` - Upload and process PDF documents
- `POST /upload/text` - Upload and process text documents
### Query Endpoints
- `POST /ask` - Query the knowledge base with a question
### Agent Endpoints
- `POST /agent/run` - Execute an agent with specific parameters
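For example, the `/ask` endpoint can be called as shown below; the request body shape (a `question` field) is an assumption, so check the interactive docs at `/docs` for the real schema.
```
import requests

# Assumed request shape; the actual /ask schema may differ.
response = requests.post(
    "http://localhost:8000/ask",
    json={"question": "What topics does the uploaded document cover?"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```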
## Environment Configuration
The application uses environment variables for configuration. See `.env.template` for required variables:
- `OPENAI_API_KEY` - API key for OpenAI services
- `ALLOWED_ORIGINS` - CORS allowed origins (comma-separated)
- `ENVIRONMENT` - Set to "production" or "development"
- `DEBUG` - Enable debug mode when set to "true"
- `LOG_LEVEL` - Set the default log level (defaults to INFO)
- `LOG_FORMAT` - Customize the log message format
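An example `.env` built from these variables (all values below are placeholders, and the `LOG_FORMAT` shown is simply the standard Python logging format string, not necessarily the project default):
```
OPENAI_API_KEY=sk-...
ALLOWED_ORIGINS=["http://localhost:5173"]
ENVIRONMENT=development
DEBUG=true
LOG_LEVEL=INFO
LOG_FORMAT=%(asctime)s - %(name)s - %(levelname)s - %(message)s
```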
### Logging Configuration
The application uses a centralized logging system that can be configured through:
1. Environment variables:
- `LOG_LEVEL`: Set the default log level (defaults to INFO)
- `LOG_FORMAT`: Customize the log message format
2. API endpoint:
- `POST /api/logging/level`: Update the log level at runtime
- Example: `curl -X POST "http://localhost:8000/api/logging/level" -H "Content-Type: application/json" -d '{"level": "DEBUG"}'`
Available log levels:
- DEBUG: Detailed information for debugging
- INFO: General operational information
- WARNING: Warning messages for potentially harmful situations
- ERROR: Error messages for serious problems
- CRITICAL: Critical messages for fatal errors
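The environment-driven part of this setup can be wired with the standard `logging` module; the sketch below uses an assumed default format string and logger name.
```
import logging
import os

# Read the documented variables, falling back to the documented defaults.
level_name = os.getenv("LOG_LEVEL", "INFO").upper()
log_format = os.getenv("LOG_FORMAT", "%(asctime)s %(levelname)s %(name)s: %(message)s")

logging.basicConfig(level=getattr(logging, level_name, logging.INFO), format=log_format)
logger = logging.getLogger("backend")
logger.info("Logging configured at %s", level_name)
```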
## CORS Configuration
To allow your frontend to communicate with the backend, set the `ALLOWED_ORIGINS` environment variable in your deployment environment (e.g., Hugging Face Spaces).
**Recommended format (JSON array):**
```
["https://your-space-name.hf.space"]
```
- A single origin can also be provided as a plain string; the backend wraps it in a list automatically:
```
https://your-space-name.hf.space
```
- For multiple origins, use:
```
["https://your-space-name.hf.space", "http://localhost:5173"]
```
The backend will automatically parse this variable and configure CORS accordingly.
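A sketch of the parsing this implies, accepting either a JSON array or a bare string and passing the result to FastAPI's CORS middleware (the middleware options other than `allow_origins` are assumptions):
```
import json
import os

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

raw = os.getenv("ALLOWED_ORIGINS", "[]")
try:
    origins = json.loads(raw)       # JSON array form
    if isinstance(origins, str):
        origins = [origins]         # JSON string form
except json.JSONDecodeError:
    origins = [raw] if raw else []  # bare string form

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```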
## Development Setup
1. Clone the repository
2. Create a `.env` file based on `.env.template`
3. Install dependencies:
```
pip install -r requirements.txt
```
4. Run the development server:
```
uvicorn backend.main:app --reload
```
## Testing
Run the test suite with:
```
pytest
```
Tests are organized by component and include:
- API endpoint tests
- Core functionality tests
- Integration tests
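An endpoint test typically uses FastAPI's `TestClient`; the request body and assertion below are illustrative rather than taken from the actual suite.
```
from fastapi.testclient import TestClient

from backend.main import app

client = TestClient(app)


def test_ask_returns_ok():
    # Assumed /ask request shape; adjust to the real schema.
    response = client.post("/ask", json={"question": "What is in the knowledge base?"})
    assert response.status_code == 200
```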
## Architecture Notes
- The application uses FastAPI as its web framework
- `VectorStore` implements a singleton pattern so that a single instance is shared across the application
- Document processing is asynchronous to handle large files efficiently
- The frontend communicates with the backend via REST API endpoints
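To illustrate the asynchronous processing note, an upload route can await the file read so large uploads do not block other requests; this is a sketch, not the actual `/upload/text` implementation.
```
from fastapi import FastAPI, UploadFile

app = FastAPI()


@app.post("/upload/text")
async def upload_text(file: UploadFile):
    # UploadFile.read() is awaitable, so the event loop stays responsive.
    contents = await file.read()
    text = contents.decode("utf-8", errors="ignore")
    # Chunking and embedding would happen here (see Core Components).
    return {"filename": file.filename, "characters": len(text)}
```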