	---
	title: Visual Search System
	emoji: 🔍
	colorFrom: blue
	colorTo: green
	sdk: streamlit
	sdk_version: "1.37.0"
	app_file: app.py
	pinned: false
	license: mit
	---

	# 🔍 Visual Search System

	A Streamlit application for browsing and searching a dataset of more than 25,000 high-quality images from Unsplash.

	## ✨ Features

	- 🔎 Search by ID: Find specific images by their ID number
	- 📦 Browse by Block: Navigate through images in organized blocks of 100
	- 📥 Automatic Downloads: Automatically downloads missing images with parallel processing
	- 🚀 Smart Dependencies: Auto-installs required packages
	- 📱 Responsive UI: Clean, modern interface optimized for all devices

	## 🚀 Quick Start

	### Local Development

	1. Clone the repository:
	```bash
	git clone <your-repo-url>
	cd visual-search-system
	```

	2. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	3. Run the app:
	```bash
	streamlit run app.py
	```

	### Hugging Face Spaces Deployment

	1. Create a new Space on Hugging Face
	2. Choose Streamlit as the SDK
	3. Upload these files:
	   - `app.py` (main application)
	   - `download_images.py` (image downloading logic)
	   - `photos_url.csv` (image dataset)
	   - `requirements.txt` (dependencies)
	   - `README.md` (this file)

	The app will automatically:
	- Install dependencies
	- Check for downloaded images
	- Download missing images if needed
	- Launch the Streamlit interface
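
The "check for downloaded images" step can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function name `find_missing_ids` and the `<id>.jpg` file naming are assumptions.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_ids(image_ids: Iterable[str], images_dir: Path) -> List[str]:
    """Return the IDs that have no `<id>.jpg` file in images_dir.

    The `<id>.jpg` naming scheme is an assumption made for this sketch.
    """
    if images_dir.is_dir():
        # Collect the file stems once so each lookup is a cheap set membership test.
        existing = {path.stem for path in images_dir.glob("*.jpg")}
    else:
        # No images folder yet: every image is missing.
        existing = set()
    return [image_id for image_id in image_ids if image_id not in existing]
```

Only the IDs returned here would need to be passed on to the downloader, so a restart does not fetch images that are already on disk.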

	## 📁 Project Structure

	```
	visual-search-system/
	├── app.py # Main Streamlit application
	├── download_images.py # Image downloading utilities
	├── photos_url.csv # Dataset with 25,000+ image URLs
	├── requirements.txt # Python dependencies
	├── README.md # This file
	└── images/ # Downloaded images (created automatically)
	```

	## 🎯 How It Works

	### Search by ID
	- Enter a specific image ID (e.g., "0001", "1234")
	- Leave empty to browse the first 500 images
	- Results update in real time
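
A minimal sketch of this lookup, assuming IDs are stored as zero-padded strings and matched exactly. The function name `search_by_id` is hypothetical, and the default of 500 mirrors the "first 500 images" behavior described above rather than the actual value in `app.py`.

```python
from typing import List, Sequence

# MAX_DISPLAY_IMAGES is named in the Customization section; 500 is assumed here.
MAX_DISPLAY_IMAGES = 500


def search_by_id(
    image_ids: Sequence[str], query: str, limit: int = MAX_DISPLAY_IMAGES
) -> List[str]:
    """Return the IDs to display for a search box value."""
    query = query.strip()
    if not query:
        # Empty search: fall back to browsing the first `limit` images.
        return list(image_ids[:limit])
    # Exact string match, so "0001" matches "0001" but "1" does not.
    return [image_id for image_id in image_ids if image_id == query][:limit]
```

Whether the real app also accepts unpadded input such as "1" is not stated here; this sketch deliberately does not.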

	### Browse by Block
	- Each block contains 100 images
	- Enter a block number between 1 and 250
	- Example: Block 100 shows images 9901-10000
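
The block arithmetic can be sketched as below, assuming 1-based numbering in which block 1 holds images 1-100. The function name `block_range` is hypothetical, and the offset used by `app.py` may differ.

```python
from typing import Tuple

BLOCK_SIZE = 100  # images per block, as described above
MAX_BLOCK = 250   # highest block number accepted by the UI


def block_range(block: int, block_size: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Return the (first, last) image numbers in a block, both inclusive.

    Assumes blocks and images are numbered from 1.
    """
    if not 1 <= block <= MAX_BLOCK:
        raise ValueError(f"block must be between 1 and {MAX_BLOCK}, got {block}")
    first = (block - 1) * block_size + 1
    last = block * block_size
    return first, last
```

Under this assumption, `block_range(1)` gives `(1, 100)` and `block_range(250)` gives `(24901, 25000)`.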

	### Image Management
	- Automatically detects existing images
	- Downloads missing images in parallel (20 workers)
	- Resizes images to fit within 800x800 pixels, preserving aspect ratio
	- Saves as compressed JPEGs
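
To make the resize step concrete, here is a sketch of how an output size can be computed so that an image fits inside the target box without distortion. The helper `fit_within` is hypothetical; only `target_size` is a name used in this README.

```python
from typing import Tuple


def fit_within(
    width: int, height: int, target_size: Tuple[int, int] = (800, 800)
) -> Tuple[int, int]:
    """Scale (width, height) down to fit inside target_size, keeping aspect ratio."""
    max_width, max_height = target_size
    # Use the tighter of the two constraints; cap at 1.0 so small images are not enlarged.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 1600x1200 photo becomes 800x600 and a 400x300 photo is left unchanged. Pillow's `Image.thumbnail((800, 800))` applies the same kind of in-place, aspect-preserving shrink; whether `download_images.py` uses that call is not stated here.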

	## 📊 Dataset Information

	- Total Images: 25,000+
	- Source: Unsplash (high-quality stock photos)
	- Format: JPEG, optimized for web
	- Size: Approximately 1.5 GB total
	- Resolution: Up to 800x800 pixels (aspect ratio preserved)

	## 🛠️ Technical Details

	### Dependencies
	- `streamlit` - Web interface framework
	- `pandas` - Data manipulation
	- `requests` - HTTP requests for image downloads
	- `pillow` - Image processing
	- `tqdm` - Progress bars

	### Performance Features
	- Parallel Downloads: Uses ThreadPoolExecutor for speed
	- Retry Logic: Handles failed downloads gracefully
	- Smart Caching: Skips already downloaded images
	- Memory Efficient: Processes images in chunks
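
The first three points can be combined in a short sketch like the one below. It is not the code in `download_images.py`: the function names, the `<id>.jpg` naming, the retry count, and the injected `fetch` callable are all assumptions made so the example stays self-contained.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def download_one(image_id: str, url: str, images_dir: Path,
                 fetch: Callable[[str], bytes],
                 retries: int = 3, backoff: float = 0.5) -> bool:
    """Download one image; return True if the file is on disk afterwards."""
    dest = images_dir / f"{image_id}.jpg"
    if dest.exists():
        return True  # caching: skip images that were already downloaded
    for attempt in range(retries):
        try:
            data = fetch(url)
            # Write to a temporary name first so a crash never leaves a half-written .jpg.
            tmp = dest.with_name(dest.name + ".part")
            tmp.write_bytes(data)
            tmp.replace(dest)
            return True
        except Exception:  # broad on purpose in this sketch; narrow it in real code
            if attempt < retries - 1:
                time.sleep(backoff * (attempt + 1))  # wait a little longer each time
    return False


def download_all(items: Iterable[Tuple[str, str]], images_dir: Path,
                 fetch: Callable[[str], bytes], max_workers: int = 20,
                 retries: int = 3, backoff: float = 0.5) -> Dict[str, bool]:
    """Download (image_id, url) pairs in parallel; map each ID to success or failure."""
    images_dir.mkdir(parents=True, exist_ok=True)
    items = list(items)

    def worker(item: Tuple[str, str]) -> bool:
        return download_one(item[0], item[1], images_dir, fetch, retries, backoff)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(worker, items))  # results keep the input order
    return {image_id: ok for (image_id, _), ok in zip(items, results)}
```

In practice `fetch` could be a small wrapper around `requests.get(url, timeout=10)` that returns the response's `content`. Passing it in as a parameter keeps the retry and caching logic testable without network access.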

	## 🔧 Configuration

	### Environment Variables
	- No environment variables required
	- All configuration is built-in

	### Customization
	- Modify `MAX_DISPLAY_IMAGES` in `app.py` to change display limit
	- Adjust `max_workers` in the download functions to trade download speed against network and CPU load
	- Change `target_size` for different image resolutions

	## 🚨 Troubleshooting

	### Common Issues

	1. "No application file found" on Hugging Face
	   - Ensure `app.py` is the main file (not `start_app.py`)
	   - Check that `requirements.txt` is present
	   - Verify Streamlit SDK is selected

	2. Image download failures
	   - Check internet connection
	   - Verify `photos_url.csv` is present
	   - Check available disk space

	3. Dependency issues
	   - Ensure Python 3.8+ is used
	   - Try updating pip: `pip install --upgrade pip`
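
Several of these checks can be run in one pass before launching the app. The script below is a sketch only: `preflight` is a hypothetical helper, and the 2 GiB free-space threshold is an assumed margin above the roughly 1.5 GB dataset.

```python
import shutil
import sys
from pathlib import Path
from typing import List

# Files this README lists as required for the app to run.
REQUIRED_FILES = ("app.py", "download_images.py", "photos_url.csv", "requirements.txt")
MIN_FREE_BYTES = 2 * 1024**3  # assumed headroom for the image dataset


def preflight(project_dir: Path, min_free_bytes: int = MIN_FREE_BYTES) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    for name in REQUIRED_FILES:
        if not (project_dir / name).is_file():
            problems.append(f"missing file: {name}")
    # disk_usage reports on the filesystem that contains project_dir.
    free = shutil.disk_usage(project_dir).free
    if free < min_free_bytes:
        problems.append(f"low disk space: {free} bytes free, {min_free_bytes} wanted")
    return problems


if __name__ == "__main__":
    for problem in preflight(Path(".")):
        print(problem)
```

Run it from the project folder. No output means the Python version, required files, and free disk space all look fine.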

	### Performance Tips

	- Faster Downloads: Increase `max_workers` in download functions
	- Memory Usage: Reduce `MAX_DISPLAY_IMAGES` for lower memory usage
	- Image Quality: Adjust JPEG quality in `download_images.py`

	## 📝 License

	This project is open source under the MIT License, as declared in the front matter above. Feel free to modify and distribute.

	## 🤝 Contributing

	1. Fork the repository
	2. Create a feature branch
	3. Make your changes
	4. Submit a pull request

	## 📞 Support

	If you encounter issues:
	1. Check the troubleshooting section above
	2. Review the console output for error messages
	3. Ensure all required files are present
	4. Verify Python version compatibility

	---

	Built with ❤️ using Streamlit and Python