---
title: Visual Search System
emoji: π
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: "1.37.0"
app_file: app.py
pinned: false
license: mit
---
# Visual Search System
A Streamlit application for browsing and searching a dataset of more than 25,000 high-quality images from Unsplash.
## Features
- **Search by ID**: Find specific images by their ID number
- **Browse by Block**: Navigate through images in organized blocks of 100
- **Automatic Downloads**: Automatically downloads missing images with parallel processing
- **Smart Dependencies**: Auto-installs required packages
- **Responsive UI**: Clean, modern interface optimized for all devices
## Quick Start
### Local Development
1. **Clone the repository:**
   ```bash
   git clone <your-repo-url>
   cd visual-search-system
   ```
2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
3. **Run the app:**
   ```bash
   streamlit run app.py
   ```
### Hugging Face Spaces Deployment
1. **Create a new Space** on Hugging Face
2. **Choose Streamlit** as the SDK
3. **Upload these files:**
   - `app.py` (main application)
   - `download_images.py` (image downloading logic)
   - `photos_url.csv` (image dataset)
   - `requirements.txt` (dependencies)
   - `README.md` (this file)

The app will automatically:
- Install dependencies
- Check for downloaded images
- Download missing images if needed
- Launch the Streamlit interface
## Project Structure
```
visual-search-system/
├── app.py                 # Main Streamlit application
├── download_images.py     # Image downloading utilities
├── photos_url.csv         # Dataset with 25,000+ image URLs
├── requirements.txt       # Python dependencies
├── README.md              # This file
└── images/                # Downloaded images (created automatically)
```
## How It Works
### Search by ID
- Enter a specific image ID (e.g., "0001", "1234")
- Leave empty to browse the first 500 images
- Results update in real-time
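The lookup described above can be sketched in plain Python. This is only an illustration: `search_ids` is a hypothetical helper, not a function from `app.py`, and it assumes that IDs are zero-padded strings such as "0001" and that a search matches one exact ID.

```python
from typing import List

# Display cap; 500 mirrors the "first 500 images" behaviour described above.
MAX_DISPLAY_IMAGES = 500


def search_ids(all_ids: List[str], query: str, limit: int = MAX_DISPLAY_IMAGES) -> List[str]:
    """Return the image IDs to show for a given search box value (hypothetical helper)."""
    query = query.strip()
    if not query:
        # Empty search box: browse mode, capped at the display limit.
        return all_ids[:limit]
    if not query.isdigit():
        # Non-numeric input cannot match a numeric ID.
        return []
    # Assumption: IDs are zero-padded to four digits, so "1" and "0001" match the same image.
    wanted = query.zfill(4)
    return [image_id for image_id in all_ids if image_id == wanted]
```

In a Streamlit script, a function like this would typically be called with the current text input value on each rerun, which is what makes results appear to update as the user types.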
### Range by Block
- Each block contains 100 images
- Enter a block number from 1 to 250
- Example: Block 100 shows images 9901-10000
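Assuming image numbering starts at 1 and every block holds exactly 100 images, the mapping from a block number to its image range can be sketched as follows. The names `block_range`, `BLOCK_SIZE`, and `MAX_BLOCK` are illustrative and are not taken from `app.py`.

```python
from typing import Tuple

BLOCK_SIZE = 100  # images per block, as stated above
MAX_BLOCK = 250   # highest block number accepted, as stated above


def block_range(block: int) -> Tuple[int, int]:
    """Return the (first, last) image numbers in a block, both inclusive."""
    if not 1 <= block <= MAX_BLOCK:
        raise ValueError(f"block must be between 1 and {MAX_BLOCK}, got {block}")
    first = (block - 1) * BLOCK_SIZE + 1  # assumption: image numbering starts at 1
    last = block * BLOCK_SIZE
    return first, last
```

Under these assumptions, block 1 covers images 1-100 and block 250 covers images 24901-25000.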
### Image Management
- Automatically detects existing images
- Downloads missing images in parallel (20 workers)
- Resizes images to fit within 800x800 pixels
- Saves them as compressed JPEGs
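The detect-then-download flow might look roughly like the sketch below. It is not the code in `download_images.py`: `download_missing` and its `fetch` parameter are hypothetical, the `<id>.jpg` file naming is an assumption, and resizing is left out to keep the example dependency-free.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List, Optional, Tuple


def download_missing(
    urls_by_id: Dict[str, str],
    image_dir: Path,
    fetch: Callable[[str], bytes],
    max_workers: int = 20,
) -> List[str]:
    """Download every image not yet present in image_dir and return the IDs saved."""
    image_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: each image is stored as "<id>.jpg"; IDs with a file on disk are skipped.
    missing = {
        image_id: url
        for image_id, url in urls_by_id.items()
        if not (image_dir / f"{image_id}.jpg").exists()
    }

    def save_one(item: Tuple[str, str]) -> Optional[str]:
        image_id, url = item
        try:
            data = fetch(url)
        except Exception:
            # A failed download is skipped so that one bad URL does not abort the batch.
            return None
        (image_dir / f"{image_id}.jpg").write_bytes(data)
        return image_id

    # Run the downloads on a pool of worker threads (20 by default, as stated above).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(save_one, missing.items()))
    return [image_id for image_id in results if image_id is not None]
```

Passing the download step in as `fetch` keeps the sketch testable offline. With `requests` installed, a call could pass something like `lambda url: requests.get(url, timeout=10).content`.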
## Dataset Information
- **Total Images**: 25,000+
- **Source**: Unsplash (high-quality stock photos)
- **Format**: JPEG, optimized for web
- **Size**: Approximately 1.5 GB total
- **Resolution**: Fits within 800x800 pixels (aspect ratio is preserved, so the shorter side may be smaller)
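To make the resolution rule concrete, here is a small sketch of how a size that fits within an 800x800 box while keeping the aspect ratio can be computed. `fit_within` is a hypothetical helper, and the sketch assumes that images already smaller than the box are not enlarged. Pillow's `Image.thumbnail` applies the same kind of rule to an image in place.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 800) -> Tuple[int, int]:
    """Scale (width, height) down to fit inside a max_side x max_side box."""
    # Use the smaller of the two scale factors so both sides fit;
    # cap at 1.0 so small images are never upscaled (an assumption).
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 1600x1200 source comes out at 800x600, not 800x800.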
## Technical Details
### Dependencies
- `streamlit` - Web interface framework
- `pandas` - Data manipulation
- `requests` - HTTP requests for image downloads
- `pillow` - Image processing
- `tqdm` - Progress bars
### Performance Features
- **Parallel Downloads**: Uses ThreadPoolExecutor for speed
- **Retry Logic**: Handles failed downloads gracefully
- **Smart Caching**: Skips already downloaded images
- **Memory Efficient**: Processes images in chunks
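As an illustration of the retry behaviour, the sketch below retries a failing action with exponential backoff. The helper name `with_retries`, the three attempts, and the delay values are assumptions made for the example. The actual retry policy in `download_images.py` is not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

A download could then be wrapped as `with_retries(lambda: fetch(url))`, where `fetch` is whatever function performs the HTTP request.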
## Configuration
### Environment Variables
- No environment variables required
- All configuration is built-in
### Customization
- Modify `MAX_DISPLAY_IMAGES` in `app.py` to change display limit
- Adjust `max_workers` in download functions for different performance
- Change `target_size` for different image resolutions
## Troubleshooting
### Common Issues
1. **"No application file found" on Hugging Face**
   - Ensure `app.py` is the main file (not `start_app.py`)
   - Check that `requirements.txt` is present
   - Verify Streamlit SDK is selected
2. **Image download failures**
   - Check internet connection
   - Verify `photos_url.csv` is present
   - Check available disk space
3. **Dependency issues**
   - Ensure Python 3.8+ is used
   - Try updating pip: `pip install --upgrade pip`
### Performance Tips
- **Faster Downloads**: Increase `max_workers` in download functions
- **Memory Usage**: Reduce `MAX_DISPLAY_IMAGES` for lower memory usage
- **Image Quality**: Adjust JPEG quality in `download_images.py`
## License
This project is open source under the MIT license, as declared in the Space metadata (`license: mit`). Feel free to modify and distribute it.
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## Support
If you encounter issues:
1. Check the troubleshooting section above
2. Review the console output for error messages
3. Ensure all required files are present
4. Verify Python version compatibility
---
**Built with ❤️ using Streamlit and Python**