| title: Visual Search System | |
| emoji: π | |
| colorFrom: blue | |
| colorTo: green | |
| sdk: streamlit | |
| sdk_version: "1.37.0" | |
| app_file: app.py | |
| pinned: false | |
| license: mit | |
| # Visual Search System | |
| A Streamlit application for browsing and searching a dataset of 25,000+ high-quality images from Unsplash. | |
| ## Features | |
| - **Search by ID**: Find a specific image by its ID number | |
| - **Browse by Block**: Navigate through images in blocks of 100 | |
| - **Automatic Downloads**: Downloads missing images in parallel | |
| - **Smart Dependencies**: Auto-installs required packages | |
| - **Responsive UI**: Clean interface that adapts to different screen sizes | |
| ## Quick Start | |
| ### Local Development | |
| 1. **Clone the repository:** | |
| ```bash | |
| git clone <your-repo-url> | |
| cd visual-search-system | |
| ``` | |
| 2. **Install dependencies:** | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| 3. **Run the app:** | |
| ```bash | |
| streamlit run app.py | |
| ``` | |
| ### Hugging Face Spaces Deployment | |
| 1. **Create a new Space** on Hugging Face | |
| 2. **Choose Streamlit** as the SDK | |
| 3. **Upload these files:** | |
| - `app.py` (main application) | |
| - `download_images.py` (image downloading logic) | |
| - `photos_url.csv` (image dataset) | |
| - `requirements.txt` (dependencies) | |
| - `README.md` (this file) | |
| The app will automatically: | |
| - Install dependencies | |
| - Check for downloaded images | |
| - Download missing images if needed | |
| - Launch the Streamlit interface | |
| ## Project Structure | |
| ``` | |
| visual-search-system/ | |
| ├── app.py              # Main Streamlit application | |
| ├── download_images.py  # Image downloading utilities | |
| ├── photos_url.csv      # Dataset with 25,000+ image URLs | |
| ├── requirements.txt    # Python dependencies | |
| ├── README.md           # This file | |
| └── images/             # Downloaded images (created automatically) | |
| ``` | |
| ## How It Works | |
| ### Search by ID | |
| - Enter a specific image ID (e.g., "0001", "1234") | |
| - Leave the field empty to browse the first 500 images | |
| - Results update in real time as the query changes | |
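The lookup described above can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the function name `search_by_id`, the zero-padded four-digit ID format, and the in-memory list of IDs are all assumptions made for the example.

```python
def search_by_id(query, image_ids, limit=500):
    """Return IDs matching the query, or the first `limit` IDs if the query is empty."""
    query = query.strip()
    if not query:
        # Empty input falls back to browsing the first `limit` images.
        return image_ids[:limit]
    # Assumption: IDs are zero-padded strings such as "0001".
    wanted = query.zfill(4)
    return [image_id for image_id in image_ids if image_id == wanted]


# Hypothetical ID list for demonstration only.
ids = [f"{n:04d}" for n in range(1, 1001)]
print(search_by_id("12", ids))     # ['0012']
print(len(search_by_id("", ids)))  # 500
```

Padding the query before comparing lets a user type `12` or `0012` and reach the same image.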
| ### Browse by Block | |
| - Each block contains 100 images | |
| - Enter a block number between 1 and 250 | |
| - Example: Block 100 shows images 9901-10000 (block 1 covers images 1-100) | |
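The block arithmetic can be written out as a small helper. This is a sketch under one assumption: blocks are numbered from 1 and block 1 covers images 1-100. If the app uses a different offset, the range shifts accordingly. The name `block_range` is hypothetical.

```python
BLOCK_SIZE = 100  # images per block, as described above
MAX_BLOCK = 250   # highest block number the UI accepts


def block_range(block, block_size=BLOCK_SIZE, max_block=MAX_BLOCK):
    """Return the (first, last) image numbers shown for a 1-based block number."""
    if not 1 <= block <= max_block:
        raise ValueError(f"block must be between 1 and {max_block}, got {block}")
    first = (block - 1) * block_size + 1
    last = block * block_size
    return first, last


print(block_range(1))    # (1, 100)
print(block_range(250))  # (24901, 25000)
```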
| ### Image Management | |
| - Automatically detects existing images | |
| - Downloads missing images in parallel (20 workers) | |
| - Resizes images to fit within 800x800 pixels | |
| - Saves as compressed JPEGs | |
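A rough sketch of the detect-then-download step, using `ThreadPoolExecutor` with the 20 workers mentioned above. It is not the code in `download_images.py`: the function name `download_missing`, the `<id>.jpg` file naming, and the injectable `fetch` callable (URL in, bytes out) are assumptions chosen so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def download_missing(urls_by_id, image_dir, fetch, max_workers=20):
    """Download only the images not yet on disk; return the IDs that were fetched."""
    image_dir = Path(image_dir)
    image_dir.mkdir(parents=True, exist_ok=True)

    # Skip anything already downloaded (assumed naming: "<id>.jpg").
    missing = {
        image_id: url
        for image_id, url in urls_by_id.items()
        if not (image_dir / f"{image_id}.jpg").exists()
    }

    def save(item):
        image_id, url = item
        (image_dir / f"{image_id}.jpg").write_bytes(fetch(url))
        return image_id

    # Threads suit this workload because downloads are I/O-bound.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sorted(pool.map(save, missing.items()))
```

In a real run, `fetch` could be something like `lambda url: requests.get(url, timeout=10).content`. A second call with the same arguments returns an empty list because every file already exists.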
| ## Dataset Information | |
| - **Total Images**: 25,000+ | |
| - **Source**: Unsplash (high-quality stock photos) | |
| - **Format**: JPEG, optimized for web | |
| - **Size**: Approximately 1.5 GB total | |
| - **Resolution**: Up to 800x800 pixels (aspect ratio is preserved) | |
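To make the resolution rule concrete, the sketch below computes the dimensions an image would have after being scaled to fit an 800x800 box while keeping its aspect ratio. The helper `fit_within` is hypothetical, and the no-upscaling behavior is an assumption. Pillow's `Image.thumbnail((800, 800))` performs a comparable shrink in place, though its rounding may differ slightly.

```python
def fit_within(width, height, target=800):
    """Scale (width, height) to fit a target x target box, preserving aspect ratio."""
    # Use the tighter of the two constraints; 1.0 means small images are left alone.
    scale = min(target / width, target / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000))  # (800, 600)
print(fit_within(3000, 4000))  # (600, 800)
print(fit_within(600, 400))    # (600, 400)
```

Only a square source image ends up at exactly 800x800. Other shapes hit 800 on their longer side.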
| ## Technical Details | |
| ### Dependencies | |
| - `streamlit` - Web interface framework | |
| - `pandas` - Data manipulation | |
| - `requests` - HTTP requests for image downloads | |
| - `pillow` - Image processing | |
| - `tqdm` - Progress bars | |
| ### Performance Features | |
| - **Parallel Downloads**: Uses ThreadPoolExecutor for speed | |
| - **Retry Logic**: Handles failed downloads gracefully | |
| - **Smart Caching**: Skips already downloaded images | |
| - **Memory Efficient**: Processes images in chunks | |
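One common way to implement retry logic like that mentioned above is exponential backoff. The sketch below is an illustration only: the name `with_retries`, the attempt count, and the delays are assumptions, not values taken from `download_images.py`.

```python
import time


def with_retries(action, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `action()` until it succeeds, waiting longer after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Wrapping a single download, for example `with_retries(lambda: fetch(url))` with a hypothetical `fetch`, keeps a transient network error from failing the whole batch.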
| ## Configuration | |
| ### Environment Variables | |
| - No environment variables required | |
| - All configuration is built-in | |
| ### Customization | |
| - Modify `MAX_DISPLAY_IMAGES` in `app.py` to change how many images are displayed at once | |
| - Adjust `max_workers` in the download functions to trade download speed against resource usage | |
| - Change `target_size` to store images at a different resolution | |
| ## Troubleshooting | |
| ### Common Issues | |
| 1. **"No application file found" on Hugging Face** | |
| - Ensure `app.py` is the main file (not `start_app.py`) | |
| - Check that `requirements.txt` is present | |
| - Verify Streamlit SDK is selected | |
| 2. **Image download failures** | |
| - Check internet connection | |
| - Verify `photos_url.csv` is present | |
| - Check available disk space | |
| 3. **Dependency issues** | |
| - Ensure Python 3.8+ is used | |
| - Try updating pip: `pip install --upgrade pip` | |
| ### Performance Tips | |
| - **Faster Downloads**: Increase `max_workers` in download functions | |
| - **Memory Usage**: Reduce `MAX_DISPLAY_IMAGES` for lower memory usage | |
| - **Image Quality**: Adjust JPEG quality in `download_images.py` | |
| ## License | |
| This project is open source under the MIT license (`license: mit` in the Space metadata). Feel free to modify and distribute. | |
| ## Contributing | |
| 1. Fork the repository | |
| 2. Create a feature branch | |
| 3. Make your changes | |
| 4. Submit a pull request | |
| ## Support | |
| If you encounter issues: | |
| 1. Check the troubleshooting section above | |
| 2. Review the console output for error messages | |
| 3. Ensure all required files are present | |
| 4. Verify Python version compatibility | |
| --- | |
| **Built with ❤️ using Streamlit and Python** | |