---
license: mit
title: Product Categorization Demo
sdk: streamlit
emoji: 🚀
colorFrom: purple
colorTo: yellow
# sdk_version: (Streamlit doesn't typically use a fixed version here)
---

# Product Categorization App - Streamlit Demo

This is a Streamlit application for categorizing products based on their similarity to ingredients or predefined categories using AI embeddings (e.g., Voyage AI) and optional reranking (Voyage AI, OpenAI).

## Quick Start

1. **Clone the repository:**
   ```bash
   git clone
   cd
   ```

2. **Create a virtual environment (optional but recommended):**
   ```bash
   python -m venv venv
   source venv/bin/activate # On Windows use `venv\Scripts\activate`
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

4. **Prepare Embeddings:** Ensure your embedding files (`ingredient_embeddings_voyageai.pkl`, `category_embeddings.pickle`, etc.) are present in the `data/` directory.

5. **Configure API Keys:**
   * Copy the `.env.example` file (if it exists) or create a new file named `.env`.
   * Add your API keys to the `.env` file:
     ```dotenv
     VOYAGE_API_KEY="YOUR_VOYAGE_API_KEY_HERE"
     OPENAI_API_KEY="YOUR_OPENAI_API_KEY_HERE"
     # Add other keys like CHICORY if needed
     ```

6. **Run the application:**
   ```bash
   streamlit run app.py
   ```
   Alternatively, if you have configured the `./run_app.sh` script:
   ```bash
   ./run_app.sh
   ```

7. The application will open in your default web browser.

## Features

- **Multiple Matching Methods:**
  - Ingredient Embeddings
  - Category Embeddings
  - Voyage AI Reranking (Ingredients/Categories)
  - OpenAI Reranking (Ingredients/Categories)
  - Comparison View across methods
- **Text Input:** Enter product names one per line.
- **Description Expansion:** Optionally use OpenAI to expand product descriptions before matching.
- **Adjustable Parameters:** Control Top-N results, confidence thresholds, etc. for different methods.
- **Example Loading:** Quickly load sample product names.

## Hosting on Hugging Face Spaces

1. Create a free account on [Hugging Face](https://huggingface.co/).
2. Go to [Hugging Face Spaces](https://huggingface.co/spaces).
3. Click "Create a new Space".
4. Select "Streamlit" as the SDK.
5. Choose a repository type (usually Git).
6. Upload all project files (including the `data` directory with embeddings) to the space repository.
7. **Important:** Add your API keys (`VOYAGE_API_KEY`, `OPENAI_API_KEY`, etc.) as **Secrets** in your Hugging Face Space settings. Do *not* commit the `.env` file directly.
8. Your app should build and deploy automatically.

## Files Included

- `app.py`: The main Streamlit application entry point.
- `ui.py`: Defines the Streamlit UI layout and components.
- `*.py` (various): Backend logic for embeddings, matching, API calls, formatting.
- `requirements.txt`: Required Python packages.
- `.env`: File to store API keys (add your keys here, **do not commit**).
- `run_app.sh`: Example script to run the app locally.
- `data/`: Directory containing embedding files.

## Requirements

- Python 3.8+
- API keys for Voyage AI and/or OpenAI (stored in `.env`).
- Internet connection for API calls.
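
The requirements above can be verified with a small pre-flight check before launching the app. The following is a minimal sketch, not part of this repository: `check_requirements` is a hypothetical helper, and treating a single key as sufficient is an assumption based on the "and/or" wording. It only inspects a mapping of environment variables, so it does not read `.env` itself; it assumes the keys have already been exported or loaded (for example by the app, or as Secrets on Hugging Face Spaces, which are exposed as environment variables).

```python
import sys
from typing import List, Mapping, Optional, Tuple

# Key names are taken from this README. Accepting either key alone is an
# assumption that mirrors "Voyage AI and/or OpenAI".
API_KEYS = ("VOYAGE_API_KEY", "OPENAI_API_KEY")
MIN_PYTHON = (3, 8)


def check_requirements(
    env: Mapping[str, str],
    python_version: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    problems = []

    # Compare (major, minor) against the minimum version listed above.
    version = tuple(python_version or sys.version_info[:2])
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {version[0]}.{version[1]}"
        )

    # A key counts as present only if it is set and not blank.
    if not any(env.get(key, "").strip() for key in API_KEYS):
        problems.append("Set at least one of: " + ", ".join(API_KEYS))

    return problems


if __name__ == "__main__":
    import os

    for problem in check_requirements(os.environ):
        print("WARNING:", problem)
```

Passing the environment in as an argument, rather than reading `os.environ` inside the function, keeps the check easy to test and lets the same code run locally and on a hosted Space.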