---
license: mit
title: Product Categorization Demo
sdk: streamlit
emoji: π
colorFrom: purple
colorTo: yellow
# sdk_version: (Streamlit doesn't typically use a fixed version here)
---
# Product Categorization App - Streamlit Demo

This Streamlit application categorizes products by their similarity to ingredients or predefined categories, using AI embeddings (e.g., Voyage AI) and optional reranking (Voyage AI or OpenAI).

## Quick Start

1. **Clone the repository:**
   ```bash
   git clone <repository_url>
   cd <repository_directory>
   ```
2. **Create a virtual environment (optional but recommended):**
   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   ```
3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
4. **Prepare Embeddings:** Ensure your embedding files (`ingredient_embeddings_voyageai.pkl`, `category_embeddings.pickle`, etc.) are present in the `data/` directory.
5. **Configure API Keys:**
   * Copy the `.env.example` file (if it exists) or create a new file named `.env`.
   * Add your API keys to the `.env` file:
     ```dotenv
     VOYAGE_API_KEY="YOUR_VOYAGE_API_KEY_HERE"
     OPENAI_API_KEY="YOUR_OPENAI_API_KEY_HERE"
     # Add other keys like CHICORY if needed
     ```
6. **Run the application:**
   ```bash
   streamlit run app.py
   ```
   Alternatively, if you have configured the `./run_app.sh` script:
   ```bash
   ./run_app.sh
   ```
7. The application will open in your default web browser.
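
Before launching, it can help to confirm that the embedding files from step 4 are actually loadable. The following is a minimal sketch, not the app's own loader: `load_embeddings` is a hypothetical helper, and the structure stored inside each pickle (for example, a dict mapping names to vectors) is an assumption, since this README does not describe it.

```python
import pickle
from pathlib import Path


def load_embeddings(path):
    """Load a pickled embedding table, failing early with a clear message.

    Hypothetical helper: the object stored in the pickle is returned as-is,
    because its layout is not documented here.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Missing embedding file: {path}")
    # Only unpickle files you trust: pickle can execute code while loading.
    with path.open("rb") as fh:
        return pickle.load(fh)
```

For example, `load_embeddings("data/category_embeddings.pickle")` either returns the stored object or raises `FileNotFoundError` naming the missing file.
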
## Features

- **Multiple Matching Methods:**
  - Ingredient Embeddings
  - Category Embeddings
  - Voyage AI Reranking (Ingredients/Categories)
  - OpenAI Reranking (Ingredients/Categories)
  - Comparison View across methods
- **Text Input:** Enter product names one per line.
- **Description Expansion:** Optionally use OpenAI to expand product descriptions before matching.
- **Adjustable Parameters:** Control Top-N results, confidence thresholds, etc. for different methods.
- **Example Loading:** Quickly load sample product names.
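
To illustrate how embedding matching with Top-N and confidence-threshold parameters typically works, here is a minimal sketch. It is not the app's actual code: the function names are hypothetical, cosine similarity is assumed as the similarity measure, and plain Python lists stand in for real embedding vectors.

```python
import math


def parse_product_names(text):
    """Split text-area input into product names, one per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n_matches(query_vec, candidates, top_n=3, threshold=0.0):
    """Return up to top_n (name, score) pairs scoring at least threshold, best first.

    `candidates` is assumed to map a label (ingredient or category) to its vector.
    """
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_n]
```

Raising `threshold` trades recall for precision, while `top_n` only caps how many of the surviving matches are shown.
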
## Hosting on Hugging Face Spaces

1. Create a free account on [Hugging Face](https://huggingface.co/).
2. Go to [Hugging Face Spaces](https://huggingface.co/spaces).
3. Click "Create a new Space".
4. Select "Streamlit" as the SDK.
5. Choose a repository type (usually Git).
6. Upload all project files (including the `data` directory with embeddings) to the Space repository.
7. **Important:** Add your API keys (`VOYAGE_API_KEY`, `OPENAI_API_KEY`, etc.) as **Secrets** in your Hugging Face Space settings. Do *not* commit the `.env` file directly.
8. Your app should build and deploy automatically.
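
Space Secrets are exposed to the running app as environment variables, so the same lookup can serve both local runs (once `.env` has been loaded into the environment) and the hosted Space. A minimal sketch, assuming the key names used in this README; `require_api_key` is a hypothetical helper, not part of the project:

```python
import os


def require_api_key(name, env=None):
    """Return the named key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to .env locally or as a Secret in the Space settings."
        )
    return value
```

Calling `require_api_key("VOYAGE_API_KEY")` at startup surfaces a missing key immediately, instead of failing later inside an API call.
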
## Files Included

- `app.py`: The main Streamlit application entry point.
- `ui.py`: Defines the Streamlit UI layout and components.
- `*.py` (various): Backend logic for embeddings, matching, API calls, formatting.
- `requirements.txt`: Required Python packages.
- `.env`: File to store API keys (add your keys here, **do not commit**).
- `run_app.sh`: Example script to run the app locally.
- `data/`: Directory containing embedding files.

## Requirements

- Python 3.8+
- API keys for Voyage AI and/or OpenAI (stored in `.env`).
- Internet connection for API calls.