Spaces: thuanan/cocktail_suggestions
thuanan committed 19 days ago

Commit 01f7130

1 parent: 4e7e9f9

Refactor and streamline Cocktail Suggestions project

- Removed src/README.md and src/SETUP_GUIDE.md to consolidate documentation in the top-level README.md.
- Enhanced src/app.py with error handling for recommender initialization and default environment variables for Hugging Face Spaces compatibility.
- Deleted data_processor.py, debug.py, demo_setup.py, docker-compose.yml, quickstart.py, setup.sh, and test_system.py from src/ to simplify the project structure.
- Added a start.sh script that sets up the environment and starts the Streamlit app with the appropriate configuration.
- Updated src/recommender.py to improve model loading with fallback options.
- Removed src/requirements.txt to streamline dependency management (the Dockerfile installs from the top-level requirements.txt).
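
The recommender.py hunk is not shown in this part of the diff, so the fallback behaviour mentioned above can only be sketched. The following is a minimal illustration, assuming "fallback options" means trying candidate loaders in order and keeping the first one that succeeds. `load_with_fallback` and the two stand-in loaders are hypothetical names, not taken from the repository.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def load_with_fallback(loaders: Sequence[Callable[[], T]]) -> T:
    """Return the result of the first loader that does not raise.

    Hypothetical helper: each loader is a zero-argument callable, for
    example one that reads a cached model and one that downloads it.
    """
    errors = []
    for loader in loaders:
        try:
            return loader()
        except Exception as exc:  # keep going, but remember why it failed
            name = getattr(loader, "__name__", repr(loader))
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all loaders failed: " + "; ".join(errors))


def load_from_cache():
    # Stand-in for a loader that fails, e.g. because the cache is empty.
    raise FileNotFoundError("no cached model")


def load_default():
    # Stand-in for a loader that succeeds.
    return "default-model"


model = load_with_fallback([load_from_cache, load_default])
print(model)
```

In the app, each loader would presumably wrap a call such as `SentenceTransformer(model_name)` with a different model name or cache location. Collecting the individual error messages keeps the final failure diagnosable when every option is exhausted.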

Files changed (19)

.gitignore +60 -0
Dockerfile +52 -5
README.md +21 -7
src/.env.example +0 -9
src/.gitignore +0 -219
src/Dockerfile +0 -28
src/README.md +0 -219
src/SETUP_GUIDE.md +0 -142
src/app.py +16 -7
src/data_processor.py +0 -256
src/debug.py +0 -203
src/demo_setup.py +0 -96
src/docker-compose.yml +0 -37
src/quickstart.py +0 -116
src/recommender.py +12 -2
src/requirements.txt +0 -9
src/setup.sh +0 -33
src/test_system.py +0 -165
start.sh +23 -0

.gitignore ADDED

@@ -0,0 +1,60 @@

+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# Virtual environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+# OS
+.DS_Store
+Thumbs.db
+# Logs
+*.log
+# Database
+*.db
+*.sqlite
+# Cache
+.cache/
+.streamlit/
+# Hugging Face
+.huggingface/
+# Model cache
+models/
+checkpoints/

Dockerfile CHANGED

@@ -1,7 +1,24 @@
 FROM python:3.9-slim
-WORKDIR /app
 RUN apt-get update && apt-get install -y \
     build-essential \
     curl \
@@ -9,13 +26,43 @@ RUN apt-get update && apt-get install -y \
     git \
     && rm -rf /var/lib/apt/lists/*
-COPY requirements.txt ./
-COPY src/ ./src/
-RUN pip3 install -r requirements.txt
 EXPOSE 8501
 HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health
-ENTRYPOINT ["streamlit", "run", "src/app.py", "--server.port=8501", "--server.address=0.0.0.0"]

 FROM python:3.9-slim
+# Create a non-root user
+RUN useradd -m -u 1000 user
+USER user
+# Set home directory and working directory
+ENV HOME=/home/user \
+    PATH=/home/user/.local/bin:$PATH \
+    PYTHONPATH=$HOME/app \
+    PYTHONUNBUFFERED=1 \
+    GRADIO_ALLOW_FLAGGING=never \
+    GRADIO_NUM_PORTS=1 \
+    GRADIO_SERVER_NAME=0.0.0.0 \
+    GRADIO_THEME=huggingface \
+    SYSTEM=spaces
+WORKDIR $HOME/app
+# Install system dependencies
+USER root
 RUN apt-get update && apt-get install -y \
     build-essential \
     curl \
     git \
     && rm -rf /var/lib/apt/lists/*
+# Switch back to user
+USER user
+# Set up cache and config directories with proper permissions
+RUN mkdir -p $HOME/.cache/huggingface \
+    && mkdir -p $HOME/.streamlit \
+    && mkdir -p $HOME/.cache/torch
+# Set environment variables for caching
+ENV HF_HOME=$HOME/.cache/huggingface \
+    TRANSFORMERS_CACHE=$HOME/.cache/huggingface \
+    SENTENCE_TRANSFORMERS_HOME=$HOME/.cache/huggingface \
+    TORCH_HOME=$HOME/.cache/torch
+# Copy files
+COPY --chown=user requirements.txt ./
+COPY --chown=user src/ ./src/
+COPY --chown=user start.sh ./
+# Make start script executable
+RUN chmod +x start.sh
+# Install Python dependencies
+RUN pip3 install --user -r requirements.txt
+# Create Streamlit config
+RUN echo "[server]\n\
+headless = true\n\
+port = 8501\n\
+address = \"0.0.0.0\"\n\
+\n\
+[browser]\n\
+gatherUsageStats = false\n\
+" > $HOME/.streamlit/config.toml
 EXPOSE 8501
 HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health
+ENTRYPOINT ["./start.sh"]

README.md CHANGED

@@ -1,19 +1,33 @@
 ---
 title: Cocktail Suggestions
-emoji: 🚀
 colorFrom: red
-colorTo: red
 sdk: docker
 app_port: 8501
 tags:
 - streamlit
 pinned: false
-short_description: AI-Powered Cocktail Suggestions
 ---
-# Welcome to Streamlit!
-Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
-If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
-forums](https://discuss.streamlit.io).

 ---
 title: Cocktail Suggestions
+emoji: 🍹
 colorFrom: red
+colorTo: purple
 sdk: docker
 app_port: 8501
 tags:
 - streamlit
+- cocktails
+- ai
+- recommendations
 pinned: false
+short_description: AI-Powered Cocktail Recommendation System
 ---
+# 🍹 Cocktail Suggestions
+An AI-powered cocktail recommendation system that helps you discover new drinks based on your preferences.
+## Features
+- Intelligent cocktail recommendations using semantic search
+- Browse cocktails by category, glass type, and ingredients
+- Beautiful, modern UI with responsive design
+- Powered by sentence transformers and vector similarity search
+## How to Use
+1. Select your preferences from the sidebar
+2. Choose ingredients you like or want to try
+3. Get personalized cocktail recommendations
+4. Explore new drinks and save your favorites!

src/.env.example DELETED

@@ -1,9 +0,0 @@
-# Environment variables
-DB_HOST=localhost
-DB_PORT=5432
-DB_NAME=cocktails_db
-DB_USER=postgres
-DB_PASSWORD=your_password
-# Vector embedding model
-MODEL_NAME=all-MiniLM-L6-v2

src/.gitignore DELETED

@@ -1,219 +0,0 @@
-# Byte-compiled / optimized / DLL files
-__pycache__/
-*.py[codz]
-*$py.class
-# C extensions
-*.so
-# Distribution / packaging
-.Python
-build/
-develop-eggs/
-dist/
-downloads/
-eggs/
-.eggs/
-lib/
-lib64/
-parts/
-sdist/
-var/
-wheels/
-share/python-wheels/
-*.egg-info/
-.installed.cfg
-*.egg
-MANIFEST
-# Project-specific files
-.env
-data/cocktails.csv
-data/*.csv
-logs/
-*.log
-# PyInstaller
-#  Usually these files are written by a python script from a template
-#  before PyInstaller builds the exe, so as to inject date/other infos into it.
-*.manifest
-*.spec
-# Installer logs
-pip-log.txt
-pip-delete-this-directory.txt
-# Unit test / coverage reports
-htmlcov/
-.tox/
-.nox/
-.coverage
-.coverage.*
-.cache
-nosetests.xml
-coverage.xml
-*.cover
-*.py.cover
-.hypothesis/
-.pytest_cache/
-cover/
-# Translations
-*.mo
-*.pot
-# Django stuff:
-*.log
-local_settings.py
-db.sqlite3
-db.sqlite3-journal
-# Flask stuff:
-instance/
-.webassets-cache
-# Scrapy stuff:
-.scrapy
-# Sphinx documentation
-docs/_build/
-# PyBuilder
-.pybuilder/
-target/
-# Jupyter Notebook
-.ipynb_checkpoints
-# IPython
-profile_default/
-ipython_config.py
-# pyenv
-#   For a library or package, you might want to ignore these files since the code is
-#   intended to run in multiple environments; otherwise, check them in:
-# .python-version
-# pipenv
-#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
-#   However, in case of collaboration, if having platform-specific dependencies or dependencies
-#   having no cross-platform support, pipenv may install dependencies that don't work, or not
-#   install all needed dependencies.
-#Pipfile.lock
-# UV
-#   Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
-#   This is especially recommended for binary packages to ensure reproducibility, and is more
-#   commonly ignored for libraries.
-#uv.lock
-# poetry
-#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
-#   This is especially recommended for binary packages to ensure reproducibility, and is more
-#   commonly ignored for libraries.
-#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
-#poetry.lock
-#poetry.toml
-# pdm
-#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
-#   pdm recommends including project-wide configuration in pdm.toml, but excluding .pdm-python.
-#   https://pdm-project.org/en/latest/usage/project/#working-with-version-control
-#pdm.lock
-#pdm.toml
-.pdm-python
-.pdm-build/
-# pixi
-#   Similar to Pipfile.lock, it is generally recommended to include pixi.lock in version control.
-#pixi.lock
-#   Pixi creates a virtual environment in the .pixi directory, just like venv module creates one
-#   in the .venv directory. It is recommended not to include this directory in version control.
-.pixi
-# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
-__pypackages__/
-# Celery stuff
-celerybeat-schedule
-celerybeat.pid
-# SageMath parsed files
-*.sage.py
-# Environments
-.env
-.envrc
-.venv
-env/
-venv/
-ENV/
-env.bak/
-venv.bak/
-# Spyder project settings
-.spyderproject
-.spyproject
-# Rope project settings
-.ropeproject
-# mkdocs documentation
-/site
-# mypy
-.mypy_cache/
-.dmypy.json
-dmypy.json
-# Pyre type checker
-.pyre/
-# pytype static type analyzer
-.pytype/
-# Cython debug symbols
-cython_debug/
-# PyCharm
-#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
-#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
-#  and can be added to the global gitignore or merged into this file.  For a more nuclear
-#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
-# Abstra
-# Abstra is an AI-powered process automation framework.
-# Ignore directories containing user credentials, local state, and settings.
-# Learn more at https://abstra.io/docs
-.abstra/
-# Visual Studio Code
-#  Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
-#  that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
-#  and can be added to the global gitignore or merged into this file. However, if you prefer,
-#  you could uncomment the following to ignore the entire vscode folder
-# .vscode/
-# Ruff stuff:
-.ruff_cache/
-# PyPI configuration file
-.pypirc
-# Cursor
-#  Cursor is an AI-powered code editor. `.cursorignore` specifies files/directories to
-#  exclude from AI features like autocomplete and code analysis. Recommended for sensitive data
-#  refer to https://docs.cursor.com/context/ignore-files
-.cursorignore
-.cursorindexingignore
-# Marimo
-marimo/_static/
-marimo/_lsp/
-__marimo__/
-.vscode
-*.npy
-*.csv
-*.json
-.env.supabase

src/Dockerfile DELETED

@@ -1,28 +0,0 @@
-FROM python:3.10-slim
-WORKDIR /app
-# Install system dependencies
-RUN apt-get update && apt-get install -y \
-    gcc \
-    postgresql-client \
-    && rm -rf /var/lib/apt/lists/*
-# Copy requirements first for better caching
-COPY requirements.txt .
-RUN pip install --no-cache-dir -r requirements.txt
-# Copy application code
-COPY . .
-# Create data directory
-RUN mkdir -p data logs
-# Expose Streamlit port
-EXPOSE 8501
-# Health check
-HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health
-# Run the application
-CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]

src/README.md DELETED

@@ -1,219 +0,0 @@
-# 🍹 AI-Powered Cocktail Suggestions
-An intelligent cocktail recommendation system using vector databases and AI embeddings to suggest the perfect drinks based on your preferences.
-## 🚀 Live Demos
-[![Streamlit App](https://img.shields.io/badge/🍹_Streamlit-Demo-FF4B4B?style=for-the-badge&logo=streamlit&logoColor=white)](https://cocktail-suggestions.streamlit.app/)
-[![Gradio App](https://img.shields.io/badge/🍸_Gradio-Demo-FF7C00?style=for-the-badge&logo=gradio&logoColor=white)](https://huggingface.co/spaces/thuanan/cocktail_suggestions)
-> 🎯 **Try the live demos above to explore cocktail recommendations without any setup!**
-## 🎯 Project Overview
-This project creates a smart cocktail recommendation system that
-- Stores cocktail recipes in a vector database using pgvector
-- Uses AI embeddings to understand cocktail characteristics
-- Provides personalized suggestions based on user preferences
-- Features a beautiful Streamlit web interface
-## 🏗️ Architecture
-- **Database**: PostgreSQL with pgvector extension for vector similarity search
-- **AI Model**: SentenceTransformers for generating embeddings
-- **Web Framework**: Streamlit for the user interface
-- **Dataset**: Kaggle cocktails dataset with 600+ recipes
-## 📊 Dataset
-**Source**: https://www.kaggle.com/datasets/aadyasingh55/cocktails/data
-## 🛠️ Technology Stack
-- **Vector Database**: [pgvector](https://github.com/pgvector/pgvector)
-- **Web Framework**: Streamlit
-- **AI/ML**: SentenceTransformers, scikit-learn
-- **Database**: PostgreSQL
-- **Language**: Python 3.8+
-## 🚀 Quick Start
-> **💡 Want to try it first?** Check out our [live demos](#-live-demos) above for instant access without any setup!
-### Option 1: Docker (Recommended)
-```bash
-# Clone the repository
-git clone https://github.com/ThuanNaN/aio2025_cocktail_suggestions
-cd aio2025_cocktail_suggestions
-# Download the dataset
-# Place cocktails.csv in the data/ directory
-# Start with Docker Compose
-docker-compose up -d
-# Set up the database (first time only)
-docker-compose exec cocktail-app python database_setup.py
-docker-compose exec cocktail-app python data_processor.py
-# Access the app at http://localhost:8501
-```
-### Option 2: Local Setup
-```bash
-# Clone the repository
-git clone <repository-url>
-cd aio2025_cocktail_suggestions
-# Run the quick setup
-python quickstart.py
-# Or manual setup:
-pip install -r requirements.txt
-cp .env.example .env
-# Edit .env with your database credentials
-# Set up PostgreSQL with pgvector
-# Run the database setup
-python database_setup.py
-# Process and store the cocktail data
-python data_processor.py
-# Start the Streamlit app
-streamlit run app.py
-```
-## 📋 Prerequisites
-### For Local Setup
-- Python 3.8+
-- PostgreSQL with pgvector extension
-- Git
-### For Docker Setup
-- Docker and Docker Compose
-## 🔧 Configuration
-1. **Environment Variables** (`.env` file):
-   ```env
-   DB_HOST=localhost
-   DB_PORT=5432
-   DB_NAME=cocktails_db
-   DB_USER=postgres
-   DB_PASSWORD=your_password
-   MODEL_NAME=all-MiniLM-L6-v2
-   ```
-2. **Database Setup**:
-   - Install PostgreSQL
-   - Install pgvector extension
-   - Create database and user
-3. **Dataset**:
-   - Download from Kaggle
-   - Place `cocktails.csv` in `data/` directory
-## 🎮 Features
-### 🔍 Search Options
-- **By Name**: Find specific cocktails
-- **By Ingredients**: Get suggestions based on available ingredients
-- **By Style/Mood**: Find drinks matching your mood (sweet, strong, refreshing, etc.)
-- **By Occasion**: Perfect drinks for parties, date nights, etc.
-- **Mixed Preferences**: Combine multiple criteria
-- **By Category**: Browse by drink categories
-- **Random Discovery**: Let AI surprise you
-### 🎨 User Interface
-- Modern, responsive design
-- Real-time search and filtering
-- Similarity scores for recommendations
-- Detailed recipe information
-- Ingredient highlighting
-### 🧠 AI Features
-- Vector similarity search
-- Semantic understanding of preferences
-- Contextual recommendations
-- Personalized suggestions
-## 📁 Project Structure
-```text
-aio2025_cocktail_suggestions/
-├── app.py                 # Main Streamlit application
-├── database_setup.py      # Database initialization
-├── data_processor.py      # Data processing and embedding generation
-├── recommender.py         # Recommendation engine
-├── requirements.txt       # Python dependencies
-├── docker-compose.yml     # Docker setup
-├── Dockerfile            # Docker configuration
-├── quickstart.py         # Quick setup script
-├── setup.sh             # Bash setup script
-├── .env.example         # Environment variables template
-├── data/                # Dataset directory
-│   ├── README.md
-│   └── final_cocktails.csv    # (Download required)
-└── README.md           # This file
-```
-## 🔬 How It Works
-1. **Data Processing**: Cocktail recipes are processed and converted into high-dimensional vectors using SentenceTransformers
-2. **Vector Storage**: Embeddings are stored in PostgreSQL with pgvector for efficient similarity search
-3. **Recommendation**: User preferences are converted to vectors and matched against the database using cosine similarity
-4. **Ranking**: Results are ranked by similarity score and presented through the web interface
-## 🎯 Use Cases
-- **Home Bartenders**: Discover new cocktails based on available ingredients
-- **Cocktail Enthusiasts**: Explore drinks by style and preference
-- **Event Planning**: Find perfect drinks for specific occasions
-- **Learning**: Understand cocktail composition and flavor profiles
-## 🔮 Future Enhancements
-- User rating system
-- Personal cocktail collection
-- Ingredient substitution suggestions
-- Nutritional information
-- Social sharing features
-- Mobile app version
-## 🤝 Contributing
-1. Fork the repository
-2. Create a feature branch
-3. Make your changes
-4. Add tests if applicable
-5. Submit a pull request
-## 📄 License
-This project is open source and available under the MIT License.
-## 🆘 Troubleshooting
-**Common Issues:**
-1. **Database Connection Error**: Check your `.env` file and ensure PostgreSQL is running
-2. **pgvector Extension**: Make sure pgvector is properly installed in PostgreSQL
-3. **Dataset Not Found**: Download the cocktails.csv file and place it in the data/ directory
-4. **Memory Issues**: The embedding generation can be memory-intensive; consider processing in batches
-**Support:**
-- Check the logs in the `logs/` directory
-- Ensure all dependencies are installed correctly
-- Verify database credentials and connectivity
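
The "How It Works" section of the deleted README describes matching a query vector against stored embeddings by cosine similarity and ranking the results. A minimal sketch of that ranking step follows, in pure Python. The function names and the three-dimensional toy vectors are made up for illustration; real sentence embeddings are much longer, and the original project delegated the search to pgvector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, items, top_k=3):
    """Return the top_k (name, score) pairs, highest score first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy "embeddings", invented for this example only.
cocktails = {
    "Mojito": [0.9, 0.1, 0.0],
    "Negroni": [0.1, 0.9, 0.2],
    "Daiquiri": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, cocktails, top_k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these toy vectors the query points mostly along the first axis, so Mojito ranks ahead of Daiquiri and Negroni is dropped by `top_k=2`.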

src/SETUP_GUIDE.md DELETED

@@ -1,142 +0,0 @@
-# 🍹 Cocktail Suggestions - Project Setup Guide
-## 📁 What We Built
-A complete AI-powered cocktail recommendation system with:
-### Core Components
-1. **`database_setup.py`** - PostgreSQL + pgvector setup
-2. **`data_processor.py`** - Kaggle dataset processing and embedding generation
-3. **`recommender.py`** - AI-powered recommendation engine
-4. **`app.py`** - Beautiful Streamlit web interface
-### Supporting Files
-- **`requirements.txt`** - All Python dependencies
-- **`docker-compose.yml`** - Complete Docker setup
-- **`quickstart.py`** - Automated setup script
-- **`test_system.py`** - System verification
-- **`.env.example`** - Configuration template
-## 🚀 Getting Started (Choose One Method)
-### Method 1: Docker (Easiest) 🐳
-```bash
-# 1. Download the cocktail dataset
-# Go to: https://www.kaggle.com/datasets/aadyasingh55/cocktails/data
-# Download and place cocktails.csv in data/ folder
-# 2. Start everything with Docker
-docker-compose up -d
-# 3. Initialize the database (one-time setup)
-docker-compose exec cocktail-app python database_setup.py
-docker-compose exec cocktail-app python data_processor.py
-# 4. Open http://localhost:8501 in your browser
-```
-### Method 2: Local Setup 💻
-```bash
-# 1. Install dependencies
-pip install -r requirements.txt
-# 2. Set up environment
-cp .env.example .env
-# Edit .env with your PostgreSQL credentials
-# 3. Download dataset to data/cocktails.csv
-# 4. Set up database
-python database_setup.py
-# 5. Process data and generate embeddings
-python data_processor.py
-# 6. Start the web app
-streamlit run app.py
-```
-### Method 3: Quick Setup Script 🔧
-```bash
-# Run the automated setup
-python quickstart.py
-# Follow the instructions shown
-```
-## 📋 Prerequisites
-### For Docker
-- Docker and Docker Compose
-- The cocktail dataset (cocktails.csv)
-### For Local Setup
-- Python 3.8+
-- PostgreSQL with pgvector extension
-- The cocktail dataset (cocktails.csv)
-## 🎯 How to Use the App
-### Search Options
-1. **🔍 By Name** - Search for specific cocktails
-2. **🥃 By Ingredients** - Get suggestions based on what you have
-3. **🎭 By Style** - Find drinks by mood (sweet, strong, fruity, etc.)
-4. **🎉 By Occasion** - Perfect drinks for parties, dates, etc.
-5. **🎲 Mixed Preferences** - Combine multiple criteria
-6. **📂 By Category** - Browse drink categories
-7. **🎰 Random Discovery** - Let AI surprise you
-### Features
-- Real-time similarity matching
-- Beautiful, responsive interface
-- Detailed recipes and ingredients
-- Similarity scores for each recommendation
-## 🔧 Troubleshooting
-### Common Issues
-1. **"Import errors"** - Install requirements: `pip install -r requirements.txt`
-2. **"Database connection failed"** - Check PostgreSQL is running and .env file
-3. **"pgvector extension not found"** - Install pgvector in PostgreSQL
-4. **"Dataset not found"** - Download cocktails.csv to data/ folder
-5. **"Memory issues"** - The AI model needs ~2GB RAM for embeddings
-### Test Your Setup
-```bash
-python test_system.py
-```
-## 🎉 What's Next?
-After setup, you can
-- Explore 600+ cocktail recipes
-- Get personalized recommendations
-- Discover new drinks based on your preferences
-- Learn about cocktail ingredients and preparation
-## 🆘 Need Help?
-1. Check the detailed README.md
-2. Run the test script: `python test_system.py`
-3. Check logs in the logs/ directory
-4. Ensure all dependencies are installed correctly
----
-**Enjoy discovering your perfect cocktail! 🍹**

src/app.py CHANGED

@@ -1,6 +1,11 @@
 import streamlit as st
 from recommender import CocktailRecommender
 # Page config
 st.set_page_config(
     page_title="🍹 Cocktail Suggestions",
@@ -52,8 +57,13 @@ st.markdown("""
 @st.cache_resource
 def get_recommender():
-    """Initialize the cocktail recommender"""
-    return CocktailRecommender()
 def display_cocktail(cocktail):
     """Display a cocktail in a nice card format"""
@@ -93,11 +103,10 @@ def main():
         st.session_state.last_search_type = ""
     # Initialize recommender
-    try:
-        recommender = get_recommender()
-    except Exception as e:
-        st.error(f"Error initializing recommender: {e}")
-        st.info("Make sure your database is set up and the environment variables are configured.")
         return
     # Sidebar for filters and preferences

 import streamlit as st
+import os
 from recommender import CocktailRecommender
+# Set environment variables for Hugging Face Spaces compatibility
+os.environ.setdefault('STREAMLIT_SERVER_HEADLESS', 'true')
+os.environ.setdefault('STREAMLIT_BROWSER_GATHER_USAGE_STATS', 'false')
 # Page config
 st.set_page_config(
     page_title="🍹 Cocktail Suggestions",
 @st.cache_resource
 def get_recommender():
+    """Initialize the cocktail recommender with error handling"""
+    try:
+        return CocktailRecommender()
+    except Exception as e:
+        st.error(f"Error initializing cocktail recommender: {str(e)}")
+        st.info("This might be due to model loading issues. Please try refreshing the page.")
+        return None
 def display_cocktail(cocktail):
     """Display a cocktail in a nice card format"""
         st.session_state.last_search_type = ""
     # Initialize recommender
+    recommender = get_recommender()
+    if recommender is None:
+        st.error("Failed to initialize the cocktail recommender. Please try refreshing the page.")
+        st.info("If the problem persists, there might be an issue with model loading or database connection.")
         return
     # Sidebar for filters and preferences
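
One thing worth checking in the new `get_recommender`: it catches the exception inside a cached function and returns `None`. If the cache stores that `None` like any other return value, "refreshing the page" would not retry initialization. The sketch below illustrates the general memoization behaviour using `functools.lru_cache` as a stand-in. It is not Streamlit's `st.cache_resource`, whose behaviour should be verified against the installed version, and `build_recommender` is a hypothetical initializer that fails only on its first call.

```python
from functools import lru_cache

attempts = {"none_on_error": 0, "raise_on_error": 0}


def build_recommender(counter_key):
    """Hypothetical initializer that fails on its first call only."""
    attempts[counter_key] += 1
    if attempts[counter_key] == 1:
        raise ConnectionError("model not ready yet")
    return "recommender"


@lru_cache(maxsize=None)
def get_or_none():
    # Mirrors the pattern in the diff: swallow the error, return None.
    try:
        return build_recommender("none_on_error")
    except Exception:
        return None


@lru_cache(maxsize=None)
def get_or_raise():
    # Alternative: let the exception escape so nothing is memoized.
    return build_recommender("raise_on_error")


first = get_or_none()
second = get_or_none()  # served from the cache, initializer is not retried

try:
    get_or_raise()
except ConnectionError:
    pass
retry = get_or_raise()  # the failed call was not cached, so this retries

print(first, second, retry)
```

Under this assumption, letting the exception propagate out of the cached function and handling it in `main()` (as the previous version of app.py did) keeps a transient failure from being remembered.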

src/data_processor.py DELETED

@@ -1,256 +0,0 @@
-import pandas as pd
-from sentence_transformers import SentenceTransformer
-from database_setup import DatabaseSetup
-import os
-from dotenv import load_dotenv
-load_dotenv()
-class CocktailDataProcessor:
-    def __init__(self):
-        self.model_name = os.getenv('MODEL_NAME', 'all-MiniLM-L6-v2')
-        self.model = SentenceTransformer(self.model_name)
-        self.db_setup = DatabaseSetup()
-    def load_data(self, csv_path):
-        """Load cocktail data from CSV file"""
-        try:
-            df = pd.read_csv(csv_path)
-            print(f"Loaded {len(df)} cocktails from {csv_path}")
-            return df
-        except Exception as e:
-            print(f"Error loading data: {e}")
-            return None
-    def clean_data(self, df):
-        """Clean and preprocess the cocktail data"""
-        # Auto-detect column names (handle both old and new formats)
-        name_col = 'name' if 'name' in df.columns else 'strDrink'
-        category_col = 'category' if 'category' in df.columns else 'strCategory'
-        alcoholic_col = 'alcoholic' if 'alcoholic' in df.columns else 'strAlcoholic'
-        glass_col = 'glassType' if 'glassType' in df.columns else 'strGlass'
-        instructions_col = 'instructions' if 'instructions' in df.columns else 'strInstructions'
-        print(f"Detected columns: name='{name_col}', category='{category_col}', alcoholic='{alcoholic_col}', glass='{glass_col}'")
-        # Remove duplicates based on name
-        if name_col in df.columns:
-            df = df.drop_duplicates(subset=[name_col])
-            print(f"After removing duplicates: {len(df)} cocktails")
-        # Fill missing values
-        df = df.fillna('')
-        # Create a combined text for embedding
-        df['combined_text'] = ''
-        if name_col in df.columns:
-            df['combined_text'] += df[name_col].astype(str) + ' '
-        if category_col in df.columns:
-            df['combined_text'] += df[category_col].astype(str) + ' '
-        if alcoholic_col in df.columns:
-            df['combined_text'] += df[alcoholic_col].astype(str) + ' '
-        if glass_col in df.columns:
-            df['combined_text'] += df[glass_col].astype(str) + ' '
-        # Handle ingredients (could be in different formats)
-        if 'ingredients' in df.columns:
-            # New format: ingredients as string/list
-            df['combined_text'] += df['ingredients'].astype(str) + ' '
-        else:
-            # Old format: strIngredient1, strIngredient2, etc.
-            ingredient_cols = [col for col in df.columns if col.startswith('strIngredient')]
-            for col in ingredient_cols:
-                df['combined_text'] += df[col].astype(str) + ' '
-        # Add instructions if available
-        if instructions_col in df.columns:
-            df['combined_text'] += df[instructions_col].astype(str) + ' '
-        # Clean the combined text
-        df['combined_text'] = df['combined_text'].str.replace(r'\s+', ' ', regex=True).str.strip()
-        print(f"Sample combined text: {df['combined_text'].iloc[0][:100]}...")
-        return df
-    def generate_embeddings(self, texts):
-        """Generate embeddings for the given texts"""
-        embeddings = self.model.encode(texts, show_progress_bar=True)
-        return embeddings
-    def create_recipe_text(self, row):
-        """Create a readable recipe from the row data"""
-        # Auto-detect column names
-        name_col = 'name' if 'name' in row else 'strDrink'
-        category_col = 'category' if 'category' in row else 'strCategory'
-        alcoholic_col = 'alcoholic' if 'alcoholic' in row else 'strAlcoholic'
-        glass_col = 'glassType' if 'glassType' in row else 'strGlass'
-        instructions_col = 'instructions' if 'instructions' in row else 'strInstructions'
-        recipe = f"Drink: {row.get(name_col, '')}\n"
-        recipe += f"Category: {row.get(category_col, '')}\n"
-        recipe += f"Type: {row.get(alcoholic_col, '')}\n"
-        recipe += f"Glass: {row.get(glass_col, '')}\n"
-        if row.get(instructions_col):
-            recipe += f"Instructions: {row[instructions_col]}\n"
-        recipe += "Ingredients:\n"
-        # Handle new format (ingredients as string/list)
-        if 'ingredients' in row and row['ingredients']:
-            try:
-                import ast
-                ingredients_str = row['ingredients']
-                # Parse ingredients list
-                if ingredients_str.startswith('['):
-                    ingredients = ast.literal_eval(ingredients_str)
-                else:
-                    ingredients = [ingredients_str]
-                # Parse measures if available
-                measures = []
-                if 'ingredientMeasures' in row and row['ingredientMeasures']:
-                    measures_str = row['ingredientMeasures']
-                    if measures_str.startswith('['):
-                        measures = ast.literal_eval(measures_str)
-                    else:
-                        measures = [measures_str]
-                # Combine ingredients with measures
-                for i, ingredient in enumerate(ingredients):
-                    if ingredient and str(ingredient).strip() and str(ingredient).strip() != 'None':
-                        if i < len(measures) and measures[i] and str(measures[i]).strip() != 'None':
-                            recipe += f"- {measures[i]} {ingredient}\n"
-                        else:
-                            recipe += f"- {ingredient}\n"
-            except Exception as e:
-                # Fallback: treat as simple string
-                recipe += f"- {row['ingredients']}\n"
-        else:
-            # Handle old format (strIngredient1, strIngredient2, etc.)
-            for i in range(1, 16):  # Assuming max 15 ingredients
-                ingredient = row.get(f'strIngredient{i}')
-                measure = row.get(f'strMeasure{i}')
-                if ingredient and str(ingredient).strip() and str(ingredient).strip() != 'nan':
-                    if measure and str(measure).strip() and str(measure).strip() != 'nan':
-                        recipe += f"- {measure} {ingredient}\n"
-                    else:
-                        recipe += f"- {ingredient}\n"
-        return recipe
-    def get_ingredients_list(self, row):
-        """Extract ingredients as a comma-separated string"""
-        ingredients = []
-        # Handle new format (ingredients as string/list)
-        if 'ingredients' in row and row['ingredients']:
-            try:
-                import ast
-                ingredients_str = row['ingredients']
-                # Parse ingredients list
-                if ingredients_str.startswith('['):
-                    ingredients_list = ast.literal_eval(ingredients_str)
-                    for ingredient in ingredients_list:
-                        if ingredient and str(ingredient).strip() and str(ingredient).strip() != 'None':
-                            ingredients.append(str(ingredient).strip())
-                else:
-                    # Single ingredient as string
-                    if ingredients_str.strip():
-                        ingredients.append(ingredients_str.strip())
-            except Exception as e:
-                # Fallback: treat as simple string
-                if row['ingredients'].strip():
-                    ingredients.append(row['ingredients'].strip())
-        else:
-            # Handle old format (strIngredient1, strIngredient2, etc.)
-            for i in range(1, 16):
-                ingredient = row.get(f'strIngredient{i}')
-                if ingredient and str(ingredient).strip() and str(ingredient).strip() != 'nan':
-                    ingredients.append(str(ingredient).strip())
-        return ', '.join(ingredients)
-    def store_cocktails(self, df):
-        """Store cocktails with embeddings in the database"""
-        try:
-            conn = self.db_setup.get_connection()
-            cursor = conn.cursor()
-            # Clear existing data
-            cursor.execute("DELETE FROM cocktails")
-            print(f"Generating embeddings for {len(df)} cocktails...")
-            # Generate all embeddings at once (much more efficient)
-            all_embeddings = self.generate_embeddings(df['combined_text'].tolist())
-            print("Storing cocktails in database...")
-            for idx, (_, row) in enumerate(df.iterrows()):
-                # Get pre-computed embedding
-                embedding = all_embeddings[idx]
-                # Prepare data with auto-detected column names
-                name_col = 'name' if 'name' in row else 'strDrink'
-                category_col = 'category' if 'category' in row else 'strCategory'
-                alcoholic_col = 'alcoholic' if 'alcoholic' in row else 'strAlcoholic'
-                glass_col = 'glassType' if 'glassType' in row else 'strGlass'
-                name = row.get(name_col, '')
-                ingredients = self.get_ingredients_list(row)
-                recipe = self.create_recipe_text(row)
-                glass = row.get(glass_col, '')
-                category = row.get(category_col, '')
-                iba = row.get('strIBA', '')  # This might not exist in new format
-                alcoholic = row.get(alcoholic_col, '')
-                # Insert into database
-                cursor.execute("""
-                    INSERT INTO cocktails (name, ingredients, recipe, glass, category, iba, alcoholic, embedding)
-                    VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
-                """, (name, ingredients, recipe, glass, category, iba, alcoholic, embedding.tolist()))
-                if (idx + 1) % 100 == 0:
-                    print(f"Stored {idx + 1} cocktails...")
-            conn.commit()
-            cursor.close()
-            conn.close()
-            print(f"Successfully stored {len(df)} cocktails in the database")
-        except Exception as e:
-            print(f"Error storing cocktails: {e}")
-            if 'conn' in locals():
-                conn.rollback()
-                conn.close()
-    def process_and_store(self, csv_path):
-        """Complete pipeline to process and store cocktail data"""
-        # Load data
-        df = self.load_data(csv_path)
-        if df is None:
-            return
-        # Clean data
-        df = self.clean_data(df)
-        # Store in database
-        self.store_cocktails(df)
-if __name__ == "__main__":
-    processor = CocktailDataProcessor()
-    # Assuming the CSV file is in the data directory
-    csv_path = "data/final_cocktails.csv"
-    if os.path.exists(csv_path):
-        processor.process_and_store(csv_path)
-    else:
-        print(f"Please download the cocktails dataset and place it at {csv_path}")
-        print("Dataset URL: https://www.kaggle.com/datasets/aadyasingh55/cocktails/data")

src/debug.py DELETED

@@ -1,203 +0,0 @@
-#!/usr/bin/env python3
-"""
-Debug script to help troubleshoot the cocktail recommendation system
-"""
-import os
-import sys
-from dotenv import load_dotenv
-load_dotenv()
-def check_database():
-    """Check database connection and contents"""
-    print("🔍 Checking database...")
-    try:
-        from database_setup import DatabaseSetup
-        db = DatabaseSetup()
-        # Test connection
-        conn = db.get_connection()
-        cursor = conn.cursor()
-        # Check if cocktails table exists
-        cursor.execute("""
-            SELECT EXISTS (
-                SELECT FROM information_schema.tables
-                WHERE table_name = 'cocktails'
-            );
-        """)
-        table_exists = cursor.fetchone()[0]
-        if not table_exists:
-            print("❌ Cocktails table doesn't exist")
-            print("Run: python database_setup.py")
-            return False
-        print("✅ Cocktails table exists")
-        # Check number of cocktails
-        cursor.execute("SELECT COUNT(*) FROM cocktails")
-        count = cursor.fetchone()[0]
-        print(f"📊 Found {count} cocktails in database")
-        if count == 0:
-            print("❌ No cocktails in database")
-            print("Run: python data_processor.py")
-            return False
-        # Check if embeddings exist
-        cursor.execute("SELECT COUNT(*) FROM cocktails WHERE embedding IS NOT NULL")
-        embedding_count = cursor.fetchone()[0]
-        print(f"🧠 {embedding_count} cocktails have embeddings")
-        # Test a simple query
-        cursor.execute("SELECT name FROM cocktails LIMIT 3")
-        samples = cursor.fetchall()
-        print("📝 Sample cocktails:")
-        for sample in samples:
-            print(f"  - {sample[0]}")
-        cursor.close()
-        conn.close()
-        return count > 0 and embedding_count > 0
-    except Exception as e:
-        print(f"❌ Database error: {e}")
-        return False
-def test_recommender():
-    """Test the recommendation engine"""
-    print("\n🧠 Testing recommender...")
-    try:
-        from recommender import CocktailRecommender
-        recommender = CocktailRecommender()
-        # Test random cocktails (simplest query)
-        print("Testing random cocktails...")
-        random_results = recommender.get_random_cocktails(3)
-        if random_results:
-            print(f"✅ Random query returned {len(random_results)} results")
-            for result in random_results:
-                cocktail = recommender.format_cocktail_result(result)
-                print(f"  - {cocktail['name']}")
-        else:
-            print("❌ Random query returned no results")
-            return False
-        # Test ingredient search
-        print("\nTesting ingredient search...")
-        ingredient_results = recommender.recommend_by_ingredients(['vodka'], limit=3)
-        if ingredient_results:
-            print(f"✅ Ingredient search returned {len(ingredient_results)} results")
-            for result in ingredient_results:
-                cocktail = recommender.format_cocktail_result(result)
-                print(f"  - {cocktail['name']} (Similarity: {cocktail.get('similarity', 'N/A')}%)")
-        else:
-            print("❌ Ingredient search returned no results")
-        return True
-    except Exception as e:
-        print(f"❌ Recommender error: {e}")
-        import traceback
-        traceback.print_exc()
-        return False
-def check_environment():
-    """Check environment variables"""
-    print("🔧 Checking environment...")
-    required_vars = ['DB_HOST', 'DB_PORT', 'DB_NAME', 'DB_USER', 'DB_PASSWORD']
-    for var in required_vars:
-        value = os.getenv(var)
-        if value:
-            # Hide password
-            display_value = "***" if "PASSWORD" in var else value
-            print(f"✅ {var}: {display_value}")
-        else:
-            print(f"❌ {var}: Not set")
-    # Check if .env file exists
-    if os.path.exists('.env'):
-        print("✅ .env file exists")
-    else:
-        print("❌ .env file not found")
-        print("Copy .env.example to .env and configure it")
-def check_dataset():
-    """Check if dataset exists"""
-    print("\n📊 Checking dataset...")
-    csv_path = "data/final_cocktails.csv"
-    if os.path.exists(csv_path):
-        print(f"✅ Dataset found at {csv_path}")
-        # Check file size
-        size = os.path.getsize(csv_path)
-        print(f"📏 File size: {size / 1024 / 1024:.1f} MB")
-        # Try to read first few lines
-        try:
-            import pandas as pd
-            df = pd.read_csv(csv_path, nrows=5)
-            print(f"📋 Columns: {list(df.columns)[:5]}...")
-            print(f"📈 Sample rows: {len(df)}")
-            return True
-        except Exception as e:
-            print(f"❌ Error reading dataset: {e}")
-            return False
-    else:
-        print(f"❌ Dataset not found at {csv_path}")
-        print("Download from: https://www.kaggle.com/datasets/aadyasingh55/cocktails/data")
-        return False
-def main():
-    print("🍹 Cocktail Recommendation System - Debug Tool")
-    print("=" * 50)
-    # Check environment
-    check_environment()
-    # Check dataset
-    dataset_ok = check_dataset()
-    # Check database
-    db_ok = check_database()
-    # Test recommender if database is OK
-    if db_ok:
-        recommender_ok = test_recommender()
-    else:
-        recommender_ok = False
-    print("\n📋 Summary:")
-    print(f"Dataset: {'✅' if dataset_ok else '❌'}")
-    print(f"Database: {'✅' if db_ok else '❌'}")
-    print(f"Recommender: {'✅' if recommender_ok else '❌'}")
-    if not dataset_ok:
-        print("\n💡 Next steps:")
-        print("1. Download the cocktail dataset")
-        print("2. Place it as data/cocktails.csv")
-    elif not db_ok:
-        print("\n💡 Next steps:")
-        print("1. Configure .env file")
-        print("2. Run: python database_setup.py")
-        print("3. Run: python data_processor.py")
-    elif not recommender_ok:
-        print("\n💡 Next steps:")
-        print("1. Check the error messages above")
-        print("2. Verify database connectivity")
-    else:
-        print("\n🎉 Everything looks good!")
-        print("You can now run: streamlit run app.py")
-if __name__ == "__main__":
-    main()

src/demo_setup.py DELETED

@@ -1,96 +0,0 @@
-#!/usr/bin/env python3
-"""
-Quick demo setup using sample data
-"""
-import os
-import sys
-def setup_demo():
-    print("🍹 Setting up demo with sample data...")
-    # Check if we have the full dataset
-    full_dataset = "data/cocktails.csv"
-    sample_dataset = "data/sample_cocktails.csv"
-    dataset_to_use = None
-    if os.path.exists(full_dataset):
-        print(f"✅ Found full dataset: {full_dataset}")
-        dataset_to_use = full_dataset
-    elif os.path.exists(sample_dataset):
-        print(f"✅ Using sample dataset: {sample_dataset}")
-        dataset_to_use = sample_dataset
-    else:
-        print("❌ No dataset found")
-        return False
-    # Set up database
-    print("🗄️ Setting up database...")
-    try:
-        from database_setup import DatabaseSetup
-        db_setup = DatabaseSetup()
-        db_setup.create_database()
-        db_setup.setup_pgvector()
-        print("✅ Database setup complete")
-    except Exception as e:
-        print(f"❌ Database setup failed: {e}")
-        return False
-    # Process data
-    print("🧠 Processing cocktail data...")
-    try:
-        from data_processor import CocktailDataProcessor
-        processor = CocktailDataProcessor()
-        processor.process_and_store(dataset_to_use)
-        print("✅ Data processing complete")
-    except Exception as e:
-        print(f"❌ Data processing failed: {e}")
-        import traceback
-        traceback.print_exc()
-        return False
-    # Test the system
-    print("🧪 Testing the system...")
-    try:
-        from recommender import CocktailRecommender
-        recommender = CocktailRecommender()
-        # Test random cocktails
-        results = recommender.get_random_cocktails(3)
-        if results:
-            print(f"✅ System test successful - found {len(results)} cocktails")
-            for result in results:
-                cocktail = recommender.format_cocktail_result(result)
-                print(f"  - {cocktail['name']}")
-        else:
-            print("❌ System test failed - no cocktails returned")
-            return False
-    except Exception as e:
-        print(f"❌ System test failed: {e}")
-        return False
-    return True
-def main():
-    print("🚀 Cocktail Demo Setup")
-    print("=" * 30)
-    if setup_demo():
-        print("\n🎉 Demo setup complete!")
-        print("\nYou can now run:")
-        print("  streamlit run app.py")
-        print("\nOr test with:")
-        print("  python debug.py")
-        print("\n💡 To use the full dataset:")
-        print("1. Download cocktails.csv from Kaggle")
-        print("2. Place it in data/cocktails.csv")
-        print("3. Run this script again")
-    else:
-        print("\n❌ Demo setup failed!")
-        print("Please check the error messages above")
-if __name__ == "__main__":
-    main()

src/docker-compose.yml DELETED

@@ -1,37 +0,0 @@
-services:
-  postgres:
-    image: pgvector/pgvector:pg15
-    environment:
-      POSTGRES_DB: cocktails_db
-      POSTGRES_USER: postgres
-      POSTGRES_PASSWORD: your_password
-    ports:
-      - "5432:5432"
-    volumes:
-      - postgres_data:/var/lib/postgresql/data
-    healthcheck:
-      test: ["CMD-SHELL", "pg_isready -U postgres"]
-      interval: 30s
-      timeout: 10s
-      retries: 3
-  cocktail-app:
-    build: .
-    ports:
-      - "8501:8501"
-    environment:
-      - DB_HOST=postgres
-      - DB_PORT=5432
-      - DB_NAME=cocktails_db
-      - DB_USER=postgres
-      - DB_PASSWORD=your_password
-      - MODEL_NAME=all-MiniLM-L6-v2
-    depends_on:
-      postgres:
-        condition: service_healthy
-    volumes:
-      - ./data:/app/data
-      - ./logs:/app/logs
-volumes:
-  postgres_data:

src/quickstart.py DELETED

@@ -1,116 +0,0 @@
-#!/usr/bin/env python3
-"""
-Quick start script for the Cocktail Suggestions project
-"""
-import os
-import sys
-import subprocess
-def check_python_version():
-    """Check if Python version is compatible"""
-    if sys.version_info < (3, 8):
-        print("❌ Python 3.8 or higher is required")
-        return False
-    print(f"✅ Python {sys.version_info.major}.{sys.version_info.minor} detected")
-    return True
-def install_dependencies():
-    """Install required dependencies"""
-    print("📦 Installing dependencies...")
-    try:
-        subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", "requirements.txt"])
-        print("✅ Dependencies installed successfully")
-        return True
-    except subprocess.CalledProcessError:
-        print("❌ Failed to install dependencies")
-        return False
-def check_env_file():
-    """Check if .env file exists"""
-    if not os.path.exists('.env'):
-        print("📝 Creating .env file from template...")
-        if os.path.exists('.env.example'):
-            import shutil
-            shutil.copy('.env.example', '.env')
-            print("⚠️  Please edit .env file with your database credentials!")
-        else:
-            print("❌ .env.example not found")
-            return False
-    print("✅ .env file exists")
-    return True
-def create_directories():
-    """Create necessary directories"""
-    dirs = ['data', 'logs']
-    for dir_name in dirs:
-        os.makedirs(dir_name, exist_ok=True)
-        print(f"📁 Created directory: {dir_name}")
-def check_dataset():
-    """Check if dataset exists"""
-    csv_path = "data/cocktails.csv"
-    if os.path.exists(csv_path):
-        print(f"✅ Dataset found at {csv_path}")
-        return True
-    else:
-        print(f"⚠️  Dataset not found at {csv_path}")
-        print("Please download from: https://www.kaggle.com/datasets/aadyasingh55/cocktails/data")
-        return False
-def main():
-    print("🍹 Cocktail Suggestions - Quick Start")
-    print("=" * 40)
-    # Check Python version
-    if not check_python_version():
-        return
-    # Create directories
-    create_directories()
-    # Check/create .env file
-    if not check_env_file():
-        return
-    # Install dependencies
-    if not install_dependencies():
-        return
-    # Check dataset
-    dataset_exists = check_dataset()
-    print("\n🎉 Setup completed!")
-    print("\nNext steps:")
-    print("1. Configure your database credentials in .env")
-    if not dataset_exists:
-        print("2. Download and place the cocktail dataset in data/cocktails.csv")
-        print("3. Run: python database_setup.py")
-        print("4. Run: python data_processor.py")
-    else:
-        print("2. Run: python database_setup.py")
-        print("3. Run: python data_processor.py")
-    print("4. Run: streamlit run app.py")
-    # Try to import key dependencies to verify installation
-    print("\n🔍 Verifying installations...")
-    try:
-        import streamlit
-        print("✅ Streamlit installed")
-    except ImportError:
-        print("❌ Streamlit not installed")
-    try:
-        import psycopg2
-        print("✅ psycopg2 installed")
-    except ImportError:
-        print("❌ psycopg2 not installed")
-    try:
-        import sentence_transformers
-        print("✅ sentence-transformers installed")
-    except ImportError:
-        print("❌ sentence-transformers not installed")
-if __name__ == "__main__":
-    main()

src/recommender.py CHANGED

@@ -7,8 +7,18 @@ load_dotenv()
 class CocktailRecommender:
     def __init__(self):
-        self.model_name = os.getenv('MODEL_NAME', 'all-MiniLM-L6-v2')
-        self.model = SentenceTransformer(self.model_name)
         self.db_setup = DatabaseSetup()
     def get_user_preferences_embedding(self, preferences):

 class CocktailRecommender:
     def __init__(self):
+        self.model_name = os.getenv('MODEL_NAME', 'sentence-transformers/all-MiniLM-L6-v2')
+        # Initialize model with proper cache handling
+        try:
+            self.model = SentenceTransformer(self.model_name)
+        except Exception as e:
+            print(f"Warning: Could not load model {self.model_name}: {e}")
+            # Fallback to a simpler model name format
+            fallback_model = 'all-MiniLM-L6-v2'
+            print(f"Trying fallback model: {fallback_model}")
+            self.model = SentenceTransformer(fallback_model)
         self.db_setup = DatabaseSetup()
     def get_user_preferences_embedding(self, preferences):
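
The fallback added to `CocktailRecommender.__init__` above can be generalized into a small helper. This is a minimal sketch, not the code from this commit: `load_with_fallback` and its parameters are hypothetical names introduced for illustration, and the loader is passed in so the pattern can be exercised without downloading a model. Unlike the diff, it raises a clear error when every candidate fails instead of letting the last exception propagate.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def load_with_fallback(loader: Callable[[str], T], names: Iterable[str]) -> T:
    """Try each candidate name in order and return the first model that loads.

    `loader` is any callable that takes a model name and returns a model
    (hypothetical helper; in the app it would be SentenceTransformer).
    """
    tried: List[str] = []
    for name in names:
        try:
            return loader(name)
        except Exception as exc:  # broad on purpose: mirrors the diff's handler
            print(f"Warning: could not load model {name}: {exc}")
            tried.append(name)
    # Every candidate failed, so report all of them in one place.
    raise RuntimeError(f"No model could be loaded; tried: {tried}")
```

In the recommender this would presumably be called as `load_with_fallback(SentenceTransformer, [self.model_name, 'all-MiniLM-L6-v2'])`. One caveat worth verifying: sentence-transformers is understood to resolve a bare name such as `all-MiniLM-L6-v2` to the `sentence-transformers/` organization on the Hub, so the two names in the diff may point at the same model and fail for the same reason.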

src/requirements.txt DELETED

@@ -1,9 +0,0 @@
-streamlit
-pandas
-numpy
-psycopg2-binary
-pgvector
-sentence-transformers
-scikit-learn
-python-dotenv
-requests

src/setup.sh DELETED

@@ -1,33 +0,0 @@
-#!/bin/bash
-# Setup script for the Cocktail Suggestions project
-echo "🍹 Setting up Cocktail Suggestions Project..."
-# Create necessary directories
-echo "📁 Creating directories..."
-mkdir -p data
-mkdir -p logs
-# Create .env file if it doesn't exist
-if [ ! -f .env ]; then
-    echo "📝 Creating .env file..."
-    cp .env.example .env
-    echo "⚠️  Please edit .env file with your database credentials!"
-fi
-# Install Python dependencies
-echo "📦 Installing Python dependencies..."
-pip install -r requirements.txt
-echo "✅ Setup complete!"
-echo ""
-echo "Next steps:"
-echo "1. Edit .env file with your database credentials"
-echo "2. Set up PostgreSQL with pgvector extension"
-echo "3. Download the cocktail dataset from:"
-echo "   https://www.kaggle.com/datasets/aadyasingh55/cocktails/data"
-echo "4. Place the CSV file in the data/ directory"
-echo "5. Run: python database_setup.py"
-echo "6. Run: python data_processor.py"
-echo "7. Run: streamlit run app.py"

src/test_system.py DELETED

@@ -1,165 +0,0 @@
-#!/usr/bin/env python3
-"""
-Simple test script to verify the system components
-"""
-import sys
-import os
-def test_imports():
-    """Test if all required packages can be imported"""
-    print("Testing imports...")
-    try:
-        import pandas as pd
-        print("✅ pandas imported successfully")
-    except ImportError as e:
-        print(f"❌ pandas import failed: {e}")
-        return False
-    try:
-        import numpy as np
-        print("✅ numpy imported successfully")
-    except ImportError as e:
-        print(f"❌ numpy import failed: {e}")
-        return False
-    try:
-        import streamlit as st
-        print("✅ streamlit imported successfully")
-    except ImportError as e:
-        print(f"❌ streamlit import failed: {e}")
-        return False
-    try:
-        import psycopg2
-        print("✅ psycopg2 imported successfully")
-    except ImportError as e:
-        print(f"❌ psycopg2 import failed: {e}")
-        return False
-    try:
-        from sentence_transformers import SentenceTransformer
-        print("✅ sentence-transformers imported successfully")
-    except ImportError as e:
-        print(f"❌ sentence-transformers import failed: {e}")
-        return False
-    try:
-        from dotenv import load_dotenv
-        print("✅ python-dotenv imported successfully")
-    except ImportError as e:
-        print(f"❌ python-dotenv import failed: {e}")
-        return False
-    return True
-def test_files():
-    """Test if required files exist"""
-    print("\nTesting file structure...")
-    required_files = [
-        'app.py',
-        'database_setup.py',
-        'data_processor.py',
-        'recommender.py',
-        'requirements.txt',
-        '.env.example'
-    ]
-    all_good = True
-    for file in required_files:
-        if os.path.exists(file):
-            print(f"✅ {file} exists")
-        else:
-            print(f"❌ {file} missing")
-            all_good = False
-    return all_good
-def test_database_connection():
-    """Test database connection"""
-    print("\nTesting database connection...")
-    try:
-        from dotenv import load_dotenv
-        load_dotenv()
-        import psycopg2
-        # Try to connect to default postgres database first
-        host = os.getenv('DB_HOST', 'localhost')
-        port = os.getenv('DB_PORT', '5432')
-        user = os.getenv('DB_USER', 'postgres')
-        password = os.getenv('DB_PASSWORD', 'your_password')
-        conn = psycopg2.connect(
-            host=host,
-            port=port,
-            user=user,
-            password=password,
-            database='postgres'
-        )
-        conn.close()
-        print("✅ Database connection successful")
-        return True
-    except Exception as e:
-        print(f"❌ Database connection failed: {e}")
-        print("Make sure PostgreSQL is running and credentials are correct in .env")
-        return False
-def test_model_loading():
-    """Test if the AI model can be loaded"""
-    print("\nTesting AI model loading...")
-    try:
-        from sentence_transformers import SentenceTransformer
-        model = SentenceTransformer('all-MiniLM-L6-v2')
-        print("✅ AI model loaded successfully")
-        # Test embedding generation
-        test_text = "vodka cranberry cocktail"
-        embedding = model.encode([test_text])
-        print(f"✅ Embedding generated successfully (shape: {embedding.shape})")
-        return True
-    except Exception as e:
-        print(f"❌ Model loading failed: {e}")
-        return False
-def main():
-    print("🧪 Running System Tests")
-    print("=" * 40)
-    # Test imports
-    imports_ok = test_imports()
-    # Test files
-    files_ok = test_files()
-    # Test database (only if .env exists)
-    db_ok = True
-    if os.path.exists('.env'):
-        db_ok = test_database_connection()
-    else:
-        print("\n⚠️  Skipping database test (.env file not found)")
-    # Test model loading
-    model_ok = test_model_loading()
-    print("\n📊 Test Summary:")
-    print(f"Imports: {'✅' if imports_ok else '❌'}")
-    print(f"Files: {'✅' if files_ok else '❌'}")
-    print(f"Database: {'✅' if db_ok else '❌'}")
-    print(f"AI Model: {'✅' if model_ok else '❌'}")
-    if all([imports_ok, files_ok, db_ok, model_ok]):
-        print("\n🎉 All tests passed! System is ready.")
-        return 0
-    else:
-        print("\n❌ Some tests failed. Please check the issues above.")
-        return 1
-if __name__ == "__main__":
-    sys.exit(main())

start.sh ADDED

	@@ -0,0 +1,23 @@

+#!/bin/bash
+# Ensure proper permissions and directories for Hugging Face Spaces
+mkdir -p $HOME/.cache/huggingface
+mkdir -p $HOME/.streamlit
+mkdir -p $HOME/.cache/torch
+# Set proper permissions
+chmod -R 755 $HOME/.cache
+chmod -R 755 $HOME/.streamlit
+# Export environment variables
+export HF_HOME=$HOME/.cache/huggingface
+export TRANSFORMERS_CACHE=$HOME/.cache/huggingface
+export SENTENCE_TRANSFORMERS_HOME=$HOME/.cache/huggingface
+export TORCH_HOME=$HOME/.cache/torch
+# Disable Streamlit usage stats
+export STREAMLIT_SERVER_HEADLESS=true
+export STREAMLIT_BROWSER_GATHER_USAGE_STATS=false
+echo "Starting Streamlit app..."
+exec streamlit run src/app.py --server.port=8501 --server.address=0.0.0.0 --server.headless=true --browser.gatherUsageStats=false
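
The cache setup in `start.sh` could also be done from Python, for example at the top of `app.py` before any model library is imported. The following is a rough sketch under that assumption, not code from this commit: `configure_cache_dirs` is a hypothetical helper that creates the same directories and sets the same four cache variables as the script, and it accepts an explicit mapping so it can be tried without touching the real environment.

```python
import os
from pathlib import Path
from typing import MutableMapping, Optional


def configure_cache_dirs(
    home: str, env: Optional[MutableMapping[str, str]] = None
) -> MutableMapping[str, str]:
    """Create writable cache directories under `home` and point libraries at them.

    Hypothetical helper mirroring start.sh; defaults to os.environ when no
    mapping is given.
    """
    env = os.environ if env is None else env
    hf_cache = Path(home) / ".cache" / "huggingface"
    torch_cache = Path(home) / ".cache" / "torch"
    streamlit_dir = Path(home) / ".streamlit"

    # Equivalent of the `mkdir -p` calls in start.sh.
    for directory in (hf_cache, torch_cache, streamlit_dir):
        directory.mkdir(parents=True, exist_ok=True)

    # Same variables that start.sh exports.
    env["HF_HOME"] = str(hf_cache)
    env["TRANSFORMERS_CACHE"] = str(hf_cache)
    env["SENTENCE_TRANSFORMERS_HOME"] = str(hf_cache)
    env["TORCH_HOME"] = str(torch_cache)
    return env
```

A call such as `configure_cache_dirs(os.path.expanduser("~"))` would need to run before `sentence_transformers` or `torch` is imported, since these libraries typically read their cache locations at import or first use.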