metadata
title: Visual Search System
emoji: π
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.37.0
app_file: app.py
pinned: false
license: mit
Visual Search System
A comprehensive Streamlit application for browsing and searching through a large dataset of high-quality images from Unsplash.
Features
- Search by ID: Find specific images by their ID number
- Browse by Block: Navigate through images in organized blocks of 100
- Automatic Downloads: Automatically downloads missing images with parallel processing
- Smart Dependencies: Auto-installs required packages
- Responsive UI: Clean, modern interface optimized for all devices
Quick Start
Local Development
Clone the repository:
git clone <your-repo-url>
cd visual-search-system
Install dependencies:
pip install -r requirements.txt
Run the app:
streamlit run app.py
Hugging Face Spaces Deployment
- Create a new Space on Hugging Face
- Choose Streamlit as the SDK
- Upload these files:
  - app.py (main application)
  - download_images.py (image downloading logic)
  - photos_url.csv (image dataset)
  - requirements.txt (dependencies)
  - README.md (this file)
The app will automatically:
- Install dependencies
- Check for downloaded images
- Download missing images if needed
- Launch the Streamlit interface
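The "check for downloaded images" step above can be sketched as a comparison between the rows of photos_url.csv and the files already in images/. This is a minimal sketch, not the code in download_images.py: the column name `photo_image_url`, the row-number-based zero-padded IDs, and the `<id>.jpg` filename pattern are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def find_missing(csv_path, images_dir, url_column="photo_image_url"):
    """Return (image_id, url) pairs for CSV rows with no file in images_dir.

    Assumptions (hypothetical, not confirmed by the project):
    - the CSV has a header row with a column named `url_column`
    - image IDs are the 1-based row number, zero-padded to 4 digits
    - downloaded files are stored as "<image_id>.jpg"
    """
    images_dir = Path(images_dir)
    # A missing images/ folder simply means nothing has been downloaded yet.
    existing = (
        {p.stem for p in images_dir.glob("*.jpg")} if images_dir.is_dir() else set()
    )
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row_number, row in enumerate(csv.DictReader(handle), start=1):
            image_id = f"{row_number:04d}"
            if image_id not in existing:
                missing.append((image_id, row[url_column]))
    return missing
```

Calling `find_missing("photos_url.csv", "images")` would then give the list of downloads still needed; an empty list means the app could go straight to launching the interface.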
Project Structure
visual-search-system/
├── app.py               # Main Streamlit application
├── download_images.py   # Image downloading utilities
├── photos_url.csv       # Dataset with 25,000+ image URLs
├── requirements.txt     # Python dependencies
├── README.md            # This file
└── images/              # Downloaded images (created automatically)
How It Works
Search by ID
- Enter a specific image ID (e.g., "0001", "1234")
- Leave empty to browse the first 500 images
- Results update in real time
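The search behavior described above might look roughly like the sketch below. It assumes IDs are stored as zero-padded strings and that a query must match one ID exactly; `search_ids` is a hypothetical helper, and the value 500 for `MAX_DISPLAY_IMAGES` is inferred from "the first 500 images" rather than taken from app.py.

```python
MAX_DISPLAY_IMAGES = 500  # assumed value, inferred from "first 500 images"


def search_ids(all_ids, query, limit=MAX_DISPLAY_IMAGES):
    """Return the image IDs to display for a search box value.

    An empty query falls back to browsing the first `limit` images.
    Otherwise the query is zero-padded to 4 digits (an assumption based
    on the "0001" example) and matched exactly.
    """
    query = query.strip()
    if not query:
        return list(all_ids[:limit])
    wanted = query.zfill(4)
    return [image_id for image_id in all_ids if image_id == wanted][:limit]
```

In a Streamlit app, a function like this would typically be called with the current value of a text input on every rerun, which is what makes the results appear to update as the user types.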
Range by Block
- Each block contains 100 images
- Enter a block number between 1 and 250
- Example: Block 100 shows images 10001-10100
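The block-to-range arithmetic can be written out as a small helper. This sketch uses the offset implied by the example above (block 100 maps to images 10001-10100); `block_range` is a hypothetical name, and the indexing in app.py may differ.

```python
BLOCK_SIZE = 100  # images per block, as stated above
MAX_BLOCK = 250   # highest block number accepted, as stated above


def block_range(block):
    """Return (first_id, last_id) for a block number, both inclusive.

    The offset follows the README example: block 100 -> 10001..10100.
    """
    if not 1 <= block <= MAX_BLOCK:
        raise ValueError(f"block must be between 1 and {MAX_BLOCK}, got {block}")
    first = block * BLOCK_SIZE + 1
    last = first + BLOCK_SIZE - 1
    return first, last
```

For instance, `block_range(100)` returns `(10001, 10100)`, and values outside 1 to 250 raise a `ValueError` instead of silently returning an empty page.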
Image Management
- Automatically detects existing images
- Downloads missing images in parallel (20 workers)
- Resizes images to fit within 800x800 pixels
- Saves as compressed JPEGs
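To make the resizing step concrete, the sketch below computes the output dimensions for an image that has to fit inside an 800x800 box while keeping its aspect ratio. It is pure arithmetic and only an illustration: `fit_within` is a hypothetical helper, and the assumption that smaller images are never upscaled is not confirmed by the project. Pillow's `Image.thumbnail` performs a comparable shrink-to-fit resize on an actual image.

```python
def fit_within(width, height, target_size=(800, 800)):
    """Return (new_width, new_height) scaled to fit inside target_size.

    The aspect ratio is preserved, and images already smaller than the
    target are left at their original size (assumed behavior).
    """
    max_width, max_height = target_size
    # Use the tighter of the two constraints; cap at 1.0 to avoid upscaling.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

A 1600x1200 landscape photo comes out as 800x600, and a 1000x2000 portrait photo as 400x800, so only the longer side ever reaches 800 pixels.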
Dataset Information
- Total Images: 25,000+
- Source: Unsplash (high-quality stock photos)
- Format: JPEG, optimized for web
- Size: Approximately 1.5 GB total
- Resolution: Up to 800x800 pixels (aspect ratio preserved)
Technical Details
Dependencies
- streamlit: Web interface framework
- pandas: Data manipulation
- requests: HTTP requests for image downloads
- pillow: Image processing
- tqdm: Progress bars
Performance Features
- Parallel Downloads: Uses ThreadPoolExecutor for speed
- Retry Logic: Handles failed downloads gracefully
- Smart Caching: Skips already downloaded images
- Memory Efficient: Processes images in chunks
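The first three features above (parallel downloads, retries, and skipping existing files) could be combined along the lines of the sketch below. It is not the code in download_images.py: `download_one`, `download_all`, the `fetch` callable, the retry count of 3, and the `<id>.jpg` filename pattern are assumptions chosen for illustration. The HTTP call is passed in as `fetch` so the sketch stays independent of any particular client.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def download_one(image_id, url, images_dir, fetch, retries=3, delay=0.0):
    """Download one image unless it already exists; retry on failure."""
    target = Path(images_dir) / f"{image_id}.jpg"  # assumed naming scheme
    if target.exists():
        return "skipped"  # smart caching: never re-download
    for attempt in range(1, retries + 1):
        try:
            # fetch(url) must return the image bytes or raise on error.
            target.write_bytes(fetch(url))
            return "downloaded"
        except Exception:  # broad on purpose: any failure triggers a retry
            if attempt < retries:
                time.sleep(delay)
    return "failed"


def download_all(id_to_url, images_dir, fetch, max_workers=20):
    """Run download_one for every entry using a thread pool."""
    Path(images_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            image_id: pool.submit(download_one, image_id, url, images_dir, fetch)
            for image_id, url in id_to_url.items()
        }
        return {image_id: future.result() for image_id, future in futures.items()}
```

With the requests library, `fetch` could be a small function that calls `requests.get(url, timeout=...)`, raises on a bad status via `raise_for_status()`, and returns `response.content`. Threads suit this workload because the time is spent waiting on the network rather than on computation.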
Configuration
Environment Variables
- No environment variables required
- All configuration is built-in
Customization
- Modify MAX_DISPLAY_IMAGES in app.py to change the display limit
- Adjust max_workers in the download functions to tune download performance
- Change target_size for different image resolutions
Troubleshooting
Common Issues
"No application file found" on Hugging Face
- Ensure app.py is the main file (not start_app.py)
- Check that requirements.txt is present
- Verify Streamlit SDK is selected
Image download failures
- Check internet connection
- Verify photos_url.csv is present
- Check available disk space
Dependency issues
- Ensure Python 3.8+ is used
- Try updating pip:
pip install --upgrade pip
Performance Tips
- Faster Downloads: Increase max_workers in the download functions
- Memory Usage: Reduce MAX_DISPLAY_IMAGES for lower memory usage
- Image Quality: Adjust JPEG quality in download_images.py
License
This project is open source under the MIT license. Feel free to modify and distribute.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Support
If you encounter issues:
- Check the troubleshooting section above
- Review the console output for error messages
- Ensure all required files are present
- Verify Python version compatibility
Built with ❤️ using Streamlit and Python