title: Visual Search System
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.37.0
app_file: app.py
pinned: false
license: mit

🔍 Visual Search System

A Streamlit application for browsing and searching a dataset of 25,000+ high-quality images from Unsplash.

✨ Features

🔎 Search by ID: Find specific images by their ID number
📦 Browse by Block: Navigate through images in organized blocks of 100
📥 Automatic Downloads: Automatically downloads missing images with parallel processing
🚀 Smart Dependencies: Auto-installs required packages
📱 Responsive UI: Clean, modern interface optimized for all devices

🚀 Quick Start

Local Development

Clone the repository:

```
git clone <your-repo-url>
cd visual-search-system
```

Install dependencies:
```
pip install -r requirements.txt
```
Run the app:
```
streamlit run app.py
```

Hugging Face Spaces Deployment

Create a new Space on Hugging Face
Choose Streamlit as the SDK
Upload these files:
- app.py (main application)
- download_images.py (image downloading logic)
- photos_url.csv (image dataset)
- requirements.txt (dependencies)
- README.md (this file)

The app will automatically:

Install dependencies
Check for downloaded images
Download missing images if needed
Launch the Streamlit interface
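
The "check for downloaded images" step can be sketched as below. This is a minimal illustration, not the project's code: the helper name `find_missing_ids` and the `images/<id>.jpg` naming scheme are assumptions.

```python
from pathlib import Path


def find_missing_ids(image_ids, image_dir="images"):
    """Return the IDs that have no matching JPEG in image_dir.

    Assumption: files are named "<id>.jpg"; the real app may differ.
    """
    folder = Path(image_dir)
    # A missing folder simply means nothing has been downloaded yet.
    existing = {p.stem for p in folder.glob("*.jpg")} if folder.is_dir() else set()
    return [image_id for image_id in image_ids if image_id not in existing]
```

Only the IDs returned here would need to be passed on to the download step.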

📁 Project Structure

```
visual-search-system/
├── app.py                 # Main Streamlit application
├── download_images.py     # Image downloading utilities
├── photos_url.csv         # Dataset with 25,000+ image URLs
├── requirements.txt       # Python dependencies
├── README.md              # This file
└── images/                # Downloaded images (created automatically)
```

🎯 How It Works

Search by ID

Enter a specific image ID (e.g., "0001", "1234")
Leave the field empty to browse the first 500 images
Results update in real time
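
A rough sketch of the selection logic described above. The function name `select_ids` is hypothetical, and it assumes that IDs are compared as exact strings; the actual matching rule in app.py is not stated here.

```python
def select_ids(query, all_ids, limit=500):
    """Pick the IDs to display for a search box value.

    An empty query returns the first `limit` IDs; otherwise only an
    exact string match is returned. Both behaviours are assumptions.
    """
    query = query.strip()
    if not query:
        return list(all_ids[:limit])
    return [image_id for image_id in all_ids if image_id == query]
```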

Range by Block

Each block contains 100 images
Enter a block number between 1 and 250
Example: Block 100 shows images 9901-10000
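
The block arithmetic can be sketched as follows, assuming that both block numbers and image IDs start at 1 (so block 1 covers images 1-100). The helper name `block_to_id_range` is hypothetical, and the app's actual offset may differ.

```python
BLOCK_SIZE = 100   # images per block
MAX_BLOCK = 250    # highest block number accepted


def block_to_id_range(block):
    """Return (first_id, last_id) for a 1-based block number."""
    if not 1 <= block <= MAX_BLOCK:
        raise ValueError(f"block must be between 1 and {MAX_BLOCK}")
    first = (block - 1) * BLOCK_SIZE + 1
    last = block * BLOCK_SIZE
    return first, last
```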

Image Management

Automatically detects existing images
Downloads missing images in parallel (20 workers)
Resizes images to fit within 800x800 pixels
Saves as compressed JPEGs
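
To illustrate what resizing to an 800x800 bounding box means, here is a small sketch of the dimension calculation. The function `fit_within` is hypothetical; Pillow's `Image.thumbnail` performs a comparable in-place resize, though whether download_images.py uses it is not stated.

```python
def fit_within(width, height, max_size=(800, 800)):
    """Scale (width, height) down to fit inside max_size, keeping aspect ratio.

    Images that already fit are returned unchanged (no upscaling).
    """
    max_w, max_h = max_size
    scale = min(max_w / width, max_h / height, 1.0)
    # Round to whole pixels, but never below 1.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For instance, a 1600x1200 source would come out as 800x600 rather than being stretched to a square.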

📊 Dataset Information

Total Images: 25,000+
Source: Unsplash (high-quality stock photos)
Format: JPEG, optimized for web
Size: Approximately 1.5GB total
Resolution: Up to 800x800 pixels (aspect ratio preserved)

🛠️ Technical Details

Dependencies

streamlit - Web interface framework
pandas - Data manipulation
requests - HTTP requests for image downloads
pillow - Image processing
tqdm - Progress bars

Performance Features

Parallel Downloads: Uses ThreadPoolExecutor for speed
Retry Logic: Handles failed downloads gracefully
Smart Caching: Skips already downloaded images
Memory Efficient: Processes images in chunks
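
The parallel download, retry, and skip-if-present behaviour listed above could be combined roughly as in this sketch. It is not the code from download_images.py: the function names, the `<id>.jpg` file naming, and the retry count and delay are all assumptions, and `fetch` is a stand-in for the real HTTP call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def download_one(image_id, url, image_dir, fetch, retries=3, delay=1.0):
    """Save one image, skipping files that already exist.

    `fetch` is any callable that takes a URL and returns bytes.
    Returns True on success or skip, False after all retries fail.
    """
    target = Path(image_dir) / f"{image_id}.jpg"  # hypothetical naming
    if target.exists():
        return True  # already downloaded, nothing to do
    for attempt in range(1, retries + 1):
        try:
            target.write_bytes(fetch(url))
            return True
        except Exception:
            if attempt < retries:
                time.sleep(delay * attempt)  # wait a little longer each time
    return False


def download_all(items, image_dir, fetch, max_workers=20, **kwargs):
    """Download (image_id, url) pairs in parallel; return the failed IDs."""
    Path(image_dir).mkdir(parents=True, exist_ok=True)
    items = list(items)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(
            lambda item: download_one(item[0], item[1], image_dir, fetch, **kwargs),
            items,
        )
        return [image_id for (image_id, _), ok in zip(items, results) if not ok]
```

With the `requests` library, `fetch` might be something like `lambda url: requests.get(url, timeout=30).content`, ideally with a status check before the bytes are written.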

🔧 Configuration

Environment Variables

No environment variables required
All configuration is built-in

Customization

Modify MAX_DISPLAY_IMAGES in app.py to change display limit
Adjust max_workers in download functions for different performance
Change target_size for different image resolutions

🚨 Troubleshooting

Common Issues

"No application file found" on Hugging Face
- Ensure app.py is the main file (not start_app.py)
- Check that requirements.txt is present
- Verify Streamlit SDK is selected
Image download failures
- Check internet connection
- Verify photos_url.csv is present
- Check available disk space
Dependency issues
- Ensure Python 3.8+ is used
- Try updating pip: pip install --upgrade pip
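
Several of the checks above can be automated with a small preflight script. The sketch below is an illustration only: the `preflight` helper is hypothetical, and the file list is taken from the upload list in this README.

```python
import sys
from pathlib import Path

REQUIRED_FILES = ["app.py", "download_images.py", "photos_url.csv", "requirements.txt"]


def preflight(project_dir=".", min_python=(3, 8)):
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(map(str, sys.version_info[:3]))
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required, found {found}")
    for name in REQUIRED_FILES:
        if not (Path(project_dir) / name).is_file():
            problems.append(f"Missing file: {name}")
    return problems
```

Running `print(preflight())` from the project folder lists anything that needs fixing before launching the app.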

Performance Tips

Faster Downloads: Increase max_workers in download functions
Memory Usage: Reduce MAX_DISPLAY_IMAGES for lower memory usage
Image Quality: Adjust JPEG quality in download_images.py

📝 License

This project is open source under the MIT license. Feel free to modify and distribute.

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Submit a pull request

📞 Support

If you encounter issues:

Check the troubleshooting section above
Review the console output for error messages
Ensure all required files are present
Verify Python version compatibility

Built with ❤️ using Streamlit and Python