---
title: Visual Search System
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: "1.37.0"
app_file: app.py
pinned: false
license: mit
---
# 🔍 Visual Search System
A Streamlit application for browsing and searching a dataset of more than 25,000 high-quality images from Unsplash.
## ✨ Features
- **πŸ”Ž Search by ID**: Find specific images by their ID number
- **πŸ“¦ Browse by Block**: Navigate through images in organized blocks of 100
- **πŸ“₯ Automatic Downloads**: Automatically downloads missing images with parallel processing
- **πŸš€ Smart Dependencies**: Auto-installs required packages
- **πŸ“± Responsive UI**: Clean, modern interface optimized for all devices
## πŸš€ Quick Start
### Local Development
1. **Clone the repository:**
```bash
git clone <your-repo-url>
cd visual-search-system
```
2. **Install dependencies:**
```bash
pip install -r requirements.txt
```
3. **Run the app:**
```bash
streamlit run app.py
```
### Hugging Face Spaces Deployment
1. **Create a new Space** on Hugging Face
2. **Choose Streamlit** as the SDK
3. **Upload these files:**
   - `app.py` (main application)
   - `download_images.py` (image downloading logic)
   - `photos_url.csv` (image dataset)
   - `requirements.txt` (dependencies)
   - `README.md` (this file)
The app will automatically:
- Install dependencies
- Check for downloaded images
- Download missing images if needed
- Launch the Streamlit interface
## 📁 Project Structure
```
visual-search-system/
├── app.py              # Main Streamlit application
├── download_images.py  # Image downloading utilities
├── photos_url.csv      # Dataset with 25,000+ image URLs
├── requirements.txt    # Python dependencies
├── README.md           # This file
└── images/             # Downloaded images (created automatically)
```
## 🎯 How It Works
### Search by ID
- Enter a specific image ID (e.g., "0001", "1234")
- Leave empty to browse the first 500 images
- Results update in real-time
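The lookup described above can be sketched in a few lines. This is a minimal illustration, not the code in `app.py`: the function name `search_by_id`, the four-digit zero padding, and the default limit of 500 are assumptions taken from the examples in this section.

```python
from typing import List

DEFAULT_LIMIT = 500  # assumed to mirror the "first 500 images" default


def search_by_id(image_ids: List[str], query: str, limit: int = DEFAULT_LIMIT) -> List[str]:
    """Return matching image IDs, or the first `limit` IDs for an empty query."""
    query = query.strip()
    if not query:
        return image_ids[:limit]
    # Assumption: IDs are zero-padded to four digits, as in "0001".
    wanted = query.zfill(4)
    return [image_id for image_id in image_ids if image_id == wanted]


ids = [f"{n:04d}" for n in range(1, 1001)]
print(search_by_id(ids, "42"))     # ['0042']
print(len(search_by_id(ids, "")))  # 500
```

Padding the query means that typing `42` and `0042` find the same image. If the real IDs are stored as integers, the padding step would be replaced by an `int()` conversion.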
### Browse by Block
- Each block contains 100 images
- Enter a block number from 1 to 250
- Example: Block 100 shows images 9901-10000
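The block arithmetic can be made explicit with a small helper. This sketch assumes that blocks and image numbers both start at 1, so block 1 covers images 1-100; `block_range` is a hypothetical name, and the app's own indexing may use a different offset.

```python
from typing import Tuple

BLOCK_SIZE = 100  # images per block, as stated above


def block_range(block: int, block_size: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Return the (first, last) image numbers for a 1-based block number."""
    if block < 1:
        raise ValueError("block numbers start at 1")
    first = (block - 1) * block_size + 1
    last = block * block_size
    return first, last


print(block_range(1))    # (1, 100)
print(block_range(100))  # (9901, 10000)
print(block_range(250))  # (24901, 25000)
```

Under this numbering, 250 blocks of 100 cover exactly 25,000 images, which matches the block limit given above.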
### Image Management
- Automatically detects existing images
- Downloads missing images in parallel (20 workers)
- Resizes images to fit within 800x800 pixels, preserving aspect ratio
- Saves as compressed JPEGs
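The steps above can be sketched with the standard library alone. This is not the code in `download_images.py`: `fit_within`, `download_missing`, and the `fetch` callback are hypothetical names, and the real script presumably fetches with `requests` and resizes with Pillow rather than using the stand-ins shown here.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List, Tuple


def fit_within(width: int, height: int, target: Tuple[int, int] = (800, 800)) -> Tuple[int, int]:
    """Scale (width, height) down to fit inside `target`, keeping aspect ratio."""
    scale = min(target[0] / width, target[1] / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


def download_missing(urls: Dict[str, str], image_dir: Path,
                     fetch: Callable[[str], bytes], max_workers: int = 20) -> List[str]:
    """Fetch every image that is not on disk yet; return the IDs downloaded."""
    image_dir.mkdir(parents=True, exist_ok=True)
    # Skip anything that already exists so reruns only fetch what is missing.
    missing = [i for i in urls if not (image_dir / f"{i}.jpg").exists()]

    def save(image_id: str) -> str:
        (image_dir / f"{image_id}.jpg").write_bytes(fetch(urls[image_id]))
        return image_id

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(save, missing))


print(fit_within(1600, 1200))  # (800, 600)

# Demo with a stand-in fetcher, so no network access is needed.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "0001.jpg").write_bytes(b"already here")
    urls = {"0001": "https://example.com/a", "0002": "https://example.com/b"}
    done = download_missing(urls, folder, fetch=lambda url: b"fake image bytes")
    print(done)  # ['0002']
```

Passing `fetch` as a parameter keeps the sketch testable offline. Downloads are I/O-bound, which is why a thread pool helps despite Python's global interpreter lock.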
## 📊 Dataset Information
- **Total Images**: 25,000+
- **Source**: Unsplash (high-quality stock photos)
- **Format**: JPEG, optimized for web
- **Size**: Approximately 1.5 GB total
- **Resolution**: Up to 800x800 pixels (aspect ratio preserved)
## 🛠️ Technical Details
### Dependencies
- `streamlit` - Web interface framework
- `pandas` - Data manipulation
- `requests` - HTTP requests for image downloads
- `pillow` - Image processing
- `tqdm` - Progress bars
### Performance Features
- **Parallel Downloads**: Uses ThreadPoolExecutor for speed
- **Retry Logic**: Handles failed downloads gracefully
- **Smart Caching**: Skips already downloaded images
- **Memory Efficient**: Processes images in chunks
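As an illustration of the retry behaviour listed above, the following sketch wraps any fetch function in a retry loop. The exponential backoff, the default of three attempts, and the name `fetch_with_retry` are assumptions made for this example; the README does not specify how `download_images.py` retries.

```python
import time
from typing import Callable


def fetch_with_retry(fetch: Callable[[str], bytes], url: str,
                     attempts: int = 3, base_delay: float = 0.5) -> bytes:
    """Call fetch(url), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Demo: a stand-in fetcher that fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch(url: str) -> bytes:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return b"ok"


result = fetch_with_retry(flaky_fetch, "https://example.com/photo", base_delay=0.01)
print(result, calls["count"])  # b'ok' 3
```

In a real downloader it is usually better to catch only network-related exceptions instead of a bare `Exception`, so that programming errors are not retried.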
## 🔧 Configuration
### Environment Variables
- No environment variables required
- All configuration is built-in
### Customization
- Modify `MAX_DISPLAY_IMAGES` in `app.py` to change display limit
- Adjust `max_workers` in the download functions to trade download speed against resource use
- Change `target_size` for different image resolutions
## 🚨 Troubleshooting
### Common Issues
1. **"No application file found" on Hugging Face**
   - Ensure `app.py` is the main file (not `start_app.py`)
   - Check that `requirements.txt` is present
   - Verify Streamlit SDK is selected
2. **Image download failures**
   - Check internet connection
   - Verify `photos_url.csv` is present
   - Check available disk space
3. **Dependency issues**
   - Ensure Python 3.8+ is used
   - Try updating pip: `pip install --upgrade pip`
### Performance Tips
- **Faster Downloads**: Increase `max_workers` in download functions
- **Memory Usage**: Reduce `MAX_DISPLAY_IMAGES` for lower memory usage
- **Image Quality**: Adjust JPEG quality in `download_images.py`
## 📝 License
This project is open source under the MIT license declared in the metadata at the top of this file. Feel free to modify and distribute.
## 🀝 Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## 📞 Support
If you encounter issues:
1. Check the troubleshooting section above
2. Review the console output for error messages
3. Ensure all required files are present
4. Verify Python version compatibility
---
**Built with ❤️ using Streamlit and Python**