Upload 4 files
- DEPLOYMENT.md +99 -0
- README.md +171 -0
- app.py +402 -0
- requirements.txt +5 -0
DEPLOYMENT.md
ADDED
@@ -0,0 +1,99 @@
# Hugging Face Spaces Deployment Checklist

## ✅ Pre-Deployment Checklist

### 1. File Structure
- [x] `app.py` - Main Streamlit application (entry point)
- [x] `download_images.py` - Image downloading utilities
- [x] `photos_url.csv` - Image dataset (25,000+ URLs)
- [x] `requirements.txt` - Python dependencies
- [x] `README.md` - Project documentation
- [x] `.gitignore` - Clean repository
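
The file-structure items above can be checked with a short script before uploading. This is a minimal sketch, not part of the project: `missing_files` and `REQUIRED_FILES` are hypothetical names, and the list is simply copied from the checklist.

```python
from pathlib import Path

# Assumed list, copied from the "File Structure" checklist.
REQUIRED_FILES = [
    "app.py",
    "download_images.py",
    "photos_url.csv",
    "requirements.txt",
    "README.md",
    ".gitignore",
]


def missing_files(root, required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files under `root`."""
    base = Path(root)
    return [name for name in required if not (base / name).is_file()]
```

Calling `missing_files(".")` from the repository root should return an empty list when every file in the checklist is present.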
### 2. File Names (Critical for Hugging Face)
- [x] **Main app file**: `app.py` (NOT `start_app.py`)
- [x] **Dependencies**: `requirements.txt` (lowercase package names)
- [x] **Documentation**: `README.md`

### 3. Code Verification
- [x] App imports successfully
- [x] Dependencies are correctly specified
- [x] No syntax errors
- [x] Proper error handling
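
The "No syntax errors" item can be checked locally before uploading. The sketch below uses the standard-library `ast` module. `syntax_error_in` is a hypothetical helper name, and it only confirms that the source parses, not that its imports succeed at runtime.

```python
import ast


def syntax_error_in(source, filename="app.py"):
    """Return a short description of the first syntax error, or None if the source parses."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None
```

A typical call would be `syntax_error_in(open("app.py", encoding="utf-8").read())`, which returns `None` for a file that parses cleanly.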
## 🎯 Hugging Face Spaces Setup

### Step 1: Create New Space
1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
2. Click "Create new Space"
3. Choose your repository
4. **Select SDK**: `Streamlit` (CRITICAL)
5. **Select License**: Choose appropriate license
6. Click "Create Space"

### Step 2: Upload Files
Upload these files:
1. `app.py` (main application)
2. `download_images.py` (helper functions)
3. `photos_url.csv` (dataset)
4. `requirements.txt` (dependencies)
5. `README.md` (documentation)

### Step 3: Verify Deployment
1. Check that the Space shows "Building" status
2. Wait for build to complete (usually 2-5 minutes)
3. Verify the app loads without "No application file found" error
4. Test the interface functionality

## 🔧 Troubleshooting Common Issues

### Issue: "No application file found"
**Solution**: Ensure `app.py` is the main file (not `start_app.py`) and that it matches `app_file` in the `README.md` front matter

### Issue: Build fails
**Solution**: Check that `requirements.txt` uses the names packages are published under on PyPI (for example `pillow`, not the import name `PIL`)

### Issue: App loads but doesn't work
**Solution**: Check the Space logs for Python errors

### Issue: Images not downloading
**Solution**: Verify `photos_url.csv` is present and accessible
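
For the "Build fails" case above, one common cause is listing an import name in `requirements.txt` instead of the name pip installs. The sketch below flags a few well-known mismatches. The alias table is an assumption made for illustration and is far from complete, and `suspicious_requirements` is a hypothetical helper.

```python
import re

# Assumed alias table: import names that differ from their PyPI package names.
IMPORT_TO_PYPI = {
    "PIL": "pillow",
    "cv2": "opencv-python",
    "sklearn": "scikit-learn",
}

# Leading package name of a requirement line, before any version specifier.
_NAME = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)")


def suspicious_requirements(text):
    """Return (name, suggestion) pairs for lines that use an import name."""
    findings = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = _NAME.match(line)
        if match and match.group(1) in IMPORT_TO_PYPI:
            findings.append((match.group(1), IMPORT_TO_PYPI[match.group(1)]))
    return findings
```

Running it over the contents of this project's `requirements.txt` should return an empty list, since that file already lists `pillow`.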
## Post-Deployment Verification

### 1. App Loading
- [ ] App loads without errors
- [ ] No "No application file found" message
- [ ] Streamlit interface appears

### 2. Functionality
- [ ] Search by ID works
- [ ] Range by Block works
- [ ] Images display correctly
- [ ] No Python errors in console

### 3. Performance
- [ ] App responds within reasonable time
- [ ] Image downloads work (if needed)
- [ ] No memory issues

## Success Indicators

✅ **App loads successfully**
✅ **No "No application file found" error**
✅ **Streamlit interface appears**
✅ **Search functionality works**
✅ **Images display correctly**
✅ **No Python errors in logs**

## If Issues Persist

1. **Check Space logs** in Hugging Face interface
2. **Verify file names** match exactly
3. **Ensure Streamlit SDK** is selected
4. **Check requirements.txt** format
5. **Verify app.py** is the main entry point

---

**Your app should now deploy successfully on Hugging Face Spaces!**
README.md
ADDED
@@ -0,0 +1,171 @@
---
title: Visual Search System
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: "1.37.0"
app_file: app.py
pinned: false
license: mit
---

# Visual Search System

A comprehensive Streamlit application for browsing and searching through a large dataset of high-quality images from Unsplash.

## ✨ Features

- **Search by ID**: Find specific images by their ID number
- **📦 Browse by Block**: Navigate through images in organized blocks of 100
- **📥 Automatic Downloads**: Automatically downloads missing images with parallel processing
- **Smart Dependencies**: Auto-installs required packages
- **📱 Responsive UI**: Clean, modern interface optimized for all devices

## Quick Start

### Local Development

1. **Clone the repository:**
   ```bash
   git clone <your-repo-url>
   cd visual-search-system
   ```

2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

3. **Run the app:**
   ```bash
   streamlit run app.py
   ```

### Hugging Face Spaces Deployment

1. **Create a new Space** on Hugging Face
2. **Choose Streamlit** as the SDK
3. **Upload these files:**
   - `app.py` (main application)
   - `download_images.py` (image downloading logic)
   - `photos_url.csv` (image dataset)
   - `requirements.txt` (dependencies)
   - `README.md` (this file)

The app will automatically:
- Install dependencies
- Check for downloaded images
- Download missing images if needed
- Launch the Streamlit interface
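
The "check for downloaded images" step in `app.py` treats the image set as complete once at least 95% of the rows in `photos_url.csv` have a matching `.jpg` in `images/`. A minimal sketch of that rule follows. The function name and the `threshold` parameter are illustrative and do not appear in the app.

```python
def images_complete(current_count, total_count, threshold=0.95):
    """Return True when enough images are on disk to skip the download step."""
    return current_count >= total_count * threshold
```

With 25,000 URLs, a folder holding 24,000 images counts as complete, while one holding 20,000 triggers a download.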

## Project Structure

```
visual-search-system/
├── app.py                 # Main Streamlit application
├── download_images.py     # Image downloading utilities
├── photos_url.csv         # Dataset with 25,000+ image URLs
├── requirements.txt       # Python dependencies
├── README.md              # This file
└── images/                # Downloaded images (created automatically)
```

## 🎯 How It Works

### Search by ID
- Enter a specific image ID (e.g., "0001", "1234")
- Leave the field empty and click Search to browse the first 500 images
- Results update in real-time
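
`app.py` resolves an ID to a file by zero-padding numeric IDs to four digits and appending `.jpg`. The sketch below shows that mapping in isolation. `id_to_filename` is a hypothetical helper name, not a function in the app.

```python
def id_to_filename(image_id):
    """Map an ID such as "1" or "0001" to the file name used in images/."""
    if image_id.isdigit():
        # Numeric IDs are padded to at least four digits: "1" -> "0001.jpg"
        return f"{int(image_id):04d}.jpg"
    # Anything else is used as-is
    return f"{image_id}.jpg"
```

This is why entering `1` and `0001` lead to the same image, while five-digit IDs such as `12345` are left unpadded.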

### Range by Block
- Each block contains 100 images
- Enter a number between 1-250
- Example: Block 100 shows images 9901-10000
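
The block arithmetic used in `app.py` is `start = (block - 1) * 100 + 1` and `end = block * 100`. The sketch below restates it as a standalone function. `block_range` is a hypothetical name, while the two constants mirror those defined in `app.py`.

```python
IMAGES_PER_BLOCK = 100
TOTAL_BLOCKS = 250


def block_range(block_number):
    """Return (first_id, last_id) for a 1-based block number."""
    if not 1 <= block_number <= TOTAL_BLOCKS:
        raise ValueError(f"block_number must be between 1 and {TOTAL_BLOCKS}")
    start = (block_number - 1) * IMAGES_PER_BLOCK + 1
    end = block_number * IMAGES_PER_BLOCK
    return start, end
```

Block 1 covers images 1 to 100, block 100 covers 9901 to 10000, and block 250 covers 24901 to 25000.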

### Image Management
- Automatically detects existing images
- Downloads missing images in parallel (20 workers)
- Resizes images to fit within 800x800 pixels
- Saves as compressed JPEGs
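
`download_images.py` is not shown here, so the exact resizing code is unknown. Assuming the 800x800 target means "scale down to fit inside an 800x800 box, keep the aspect ratio, never upscale", the output size could be computed as in this sketch. `fit_within` is a hypothetical helper.

```python
def fit_within(width, height, max_side=800):
    """Scale (width, height) down to fit a max_side x max_side box, keeping aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Use the tighter of the two constraints; cap at 1.0 so small images are not enlarged.
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Under that assumption a 1600x1200 photo becomes 800x600, and a 400x300 photo is left unchanged.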

## Dataset Information

- **Total Images**: 25,000+
- **Source**: Unsplash (high-quality stock photos)
- **Format**: JPEG, optimized for web
- **Size**: Approximately 1.5 GB total
- **Resolution**: Up to 800x800 pixels (aspect ratio preserved)

## 🛠️ Technical Details

### Dependencies
- `streamlit` - Web interface framework
- `pandas` - Data manipulation
- `requests` - HTTP requests for image downloads
- `pillow` - Image processing
- `tqdm` - Progress bars

### Performance Features
- **Parallel Downloads**: Uses ThreadPoolExecutor for speed
- **Retry Logic**: Handles failed downloads gracefully
- **Smart Caching**: Skips already downloaded images
- **Memory Efficient**: Processes images in chunks
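
The first three features above (parallel downloads, retries, and skipping existing files) can be combined in a few lines. This is a sketch under stated assumptions, not the code in `download_images.py`: `download_all` and its parameters are hypothetical, and `fetch` stands in for whatever function turns a URL into bytes (for example a wrapper around `requests.get`).

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def download_all(jobs, fetch, out_dir, max_workers=20, retries=3):
    """Download each (name, url) job with fetch(url) -> bytes; return names that failed."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    def worker(job):
        name, url = job
        target = out / name
        if target.exists():  # caching: skip files already on disk
            return name, True
        for _ in range(retries):  # retry: try again after any fetch error
            try:
                data = fetch(url)
            except Exception:
                continue
            target.write_bytes(data)
            return name, True
        return name, False

    # pool.map preserves input order, so results line up with jobs
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(worker, jobs))
    return [name for name, ok in results if not ok]
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library and makes it easy to exercise with a stub.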

## 🔧 Configuration

### Environment Variables
- No environment variables required
- All configuration is built-in

### Customization
- Modify `MAX_DISPLAY_IMAGES` in `app.py` to change display limit
- Adjust `max_workers` in download functions for different performance
- Change `target_size` for different image resolutions

## 🚨 Troubleshooting

### Common Issues

1. **"No application file found" on Hugging Face**
   - Ensure `app.py` is the main file (not `start_app.py`)
   - Check that `requirements.txt` is present
   - Verify Streamlit SDK is selected

2. **Image download failures**
   - Check internet connection
   - Verify `photos_url.csv` is present
   - Check available disk space

3. **Dependency issues**
   - Ensure Python 3.8+ is used
   - Try updating pip: `pip install --upgrade pip`

### Performance Tips

- **Faster Downloads**: Increase `max_workers` in download functions
- **Memory Usage**: Reduce `MAX_DISPLAY_IMAGES` for lower memory usage
- **Image Quality**: Adjust JPEG quality in `download_images.py`

## License

This project is open source under the MIT license (`license: mit` in the front matter). Feel free to modify and distribute.

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request

## Support

If you encounter issues:
1. Check the troubleshooting section above
2. Review the console output for error messages
3. Ensure all required files are present
4. Verify Python version compatibility

---

**Built with ❤️ using Streamlit and Python**
app.py
ADDED
@@ -0,0 +1,402 @@
#!/usr/bin/env python3
"""
Visual Search System - Complete Streamlit App
============================================

A comprehensive Streamlit application that:
1. Automatically installs required dependencies
2. Downloads images from photos_url.csv if needed
3. Provides a clean UI for searching and viewing images
4. Supports both search by ID and range by block functionality

Requirements:
- photos_url.csv: Contains image URLs
- download_images.py: Contains parallel downloading logic
- images/ folder: Will be created and populated with downloaded images

Usage:
    streamlit run app.py

Hugging Face Deployment:
    This app is configured for Hugging Face Spaces deployment.
    Upload all files and it will run automatically.
"""

import os
import sys
import subprocess
import importlib
from pathlib import Path
import pandas as pd
import streamlit as st
from typing import List, Tuple, Optional
import time

# Configuration
# Maps each import name to the name pip installs it under.
# The two differ for Pillow: it is imported as "PIL" but installed as "pillow".
REQUIRED_PACKAGES = {
    "streamlit": "streamlit",
    "pandas": "pandas",
    "requests": "requests",
    "PIL": "pillow",
    "tqdm": "tqdm",
}

IMAGES_DIR = "images"
CSV_FILE = "photos_url.csv"
DOWNLOAD_SCRIPT = "download_images.py"
MAX_DISPLAY_IMAGES = 500
IMAGES_PER_BLOCK = 100
TOTAL_BLOCKS = 250

def install_package(package: str) -> bool:
    """
    Install a Python package using pip

    Args:
        package: Package name to install (the pip name, not the import name)

    Returns:
        True if successful, False otherwise
    """
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        return True
    except subprocess.CalledProcessError:
        return False

def check_and_install_dependencies() -> bool:
    """
    Check if required packages are installed, install if missing

    Note: streamlit and pandas are imported at module load above, so in
    practice this check can only install requests, Pillow, or tqdm.

    Returns:
        True if all dependencies are available, False otherwise
    """
    print("Checking dependencies...")

    # Import names of the packages that could not be imported
    missing_packages = []

    for package in REQUIRED_PACKAGES:
        try:
            importlib.import_module(package)
            print(f"✅ {package} is already installed")
        except ImportError:
            print(f"📦 {package} is missing")
            missing_packages.append(package)

    if missing_packages:
        print(f"Installing {len(missing_packages)} missing packages...")

        for package in missing_packages:
            pip_name = REQUIRED_PACKAGES[package]
            print(f"📥 Installing {pip_name}...")
            if install_package(pip_name):
                print(f"✅ Successfully installed {pip_name}")
            else:
                print(f"Failed to install {pip_name}")
                return False

        # Make freshly installed packages visible to the import system, then verify
        importlib.invalidate_caches()
        for package in missing_packages:
            try:
                importlib.import_module(package)
                print(f"✅ {package} verified after installation")
            except ImportError:
                print(f"{package} still not available after installation")
                return False

    print("✅ All dependencies are available!")
    return True

def check_images_status() -> Tuple[bool, int, int]:
    """
    Check the status of downloaded images

    Returns:
        Tuple of (is_complete, current_count, total_count)
    """
    images_path = Path(IMAGES_DIR)

    if not images_path.exists():
        return False, 0, 0

    # Count existing images
    existing_images = list(images_path.glob("*.jpg"))
    current_count = len(existing_images)

    # Get total count from CSV
    try:
        df = pd.read_csv(CSV_FILE)
        total_count = len(df)
    except Exception as e:
        print(f"Error reading {CSV_FILE}: {e}")
        return False, current_count, 0

    is_complete = current_count >= total_count * 0.95  # Consider complete if 95%+ downloaded

    return is_complete, current_count, total_count

def download_images_if_needed() -> bool:
    """
    Download images if they're missing or incomplete

    Returns:
        True if images are available, False otherwise
    """
    print("Checking image status...")

    is_complete, current_count, total_count = check_images_status()

    if is_complete:
        print(f"✅ Images are ready! Have {current_count:,} of {total_count:,} images")
        return True

    print(f"📥 Images incomplete: {current_count:,} of {total_count:,} available")
    print("Starting image download...")

    try:
        # Import download functions from download_images.py
        sys.path.append('.')
        from download_images import download_images

        success = download_images(
            num_images=None,  # Download all images
            output_dir=IMAGES_DIR,
            max_workers=20
        )

        if success:
            print("✅ Image download completed successfully!")
            return True
        else:
            print("⚠️ Image download had some issues, but continuing...")
            return True

    except Exception as e:
        print(f"Error during image download: {e}")
        return False

def get_image_path(image_id: str) -> Optional[str]:
    """
    Get the file path for a given image ID

    Args:
        image_id: Image ID (e.g., "0001", "1234")

    Returns:
        File path if exists, None otherwise
    """
    try:
        # Convert image ID to filename format
        if image_id.isdigit():
            filename = f"{int(image_id):04d}.jpg"
        else:
            filename = f"{image_id}.jpg"

        image_path = os.path.join(IMAGES_DIR, filename)

        if os.path.exists(image_path):
            return image_path
        else:
            return None
    except (ValueError, OSError):
        # Malformed IDs or unreadable paths are treated as "not found"
        return None

def get_block_images(block_number: int) -> List[str]:
    """
    Get all images for a specific block

    Args:
        block_number: Block number (1-250)

    Returns:
        List of image paths for the block
    """
    if not (1 <= block_number <= TOTAL_BLOCKS):
        return []

    # Calculate start and end image numbers for this block
    start_num = (block_number - 1) * IMAGES_PER_BLOCK + 1
    end_num = block_number * IMAGES_PER_BLOCK

    image_paths = []

    for i in range(start_num, end_num + 1):
        image_path = get_image_path(str(i))
        if image_path:
            image_paths.append(image_path)

    return image_paths

def search_images_by_id(search_id: str) -> List[str]:
    """
    Search for images by ID

    Args:
        search_id: Search term (can be partial)

    Returns:
        List of matching image paths
    """
    search_id = search_id.strip()
    if not search_id:
        # Return first 500 images if no search term
        candidates = (get_image_path(str(i)) for i in range(1, MAX_DISPLAY_IMAGES + 1))
        return [path for path in candidates if path]

    # Search for exact or partial matches
    matching_paths = []

    # Try exact match first
    exact_path = get_image_path(search_id)
    if exact_path:
        matching_paths.append(exact_path)

    # Search for partial matches
    total_images = TOTAL_BLOCKS * IMAGES_PER_BLOCK  # Total images in dataset
    for i in range(1, total_images + 1):
        # Run the cheap substring test before touching the filesystem
        if search_id.lower() not in str(i):
            continue
        image_path = get_image_path(str(i))
        if image_path and image_path not in matching_paths:
            matching_paths.append(image_path)
            if len(matching_paths) >= MAX_DISPLAY_IMAGES:
                break

    return matching_paths

def display_image_grid(image_paths: List[str], title: str):
    """
    Display a grid of images using Streamlit

    Args:
        image_paths: List of image file paths
        title: Title for the image grid
    """
    if not image_paths:
        st.warning("No images found matching your criteria.")
        return

    st.subheader(f"{title} ({len(image_paths)} images)")

    # Create columns for the grid (3 columns)
    cols = st.columns(3)

    for idx, image_path in enumerate(image_paths):
        col_idx = idx % 3
        with cols[col_idx]:
            try:
                st.image(image_path, caption=f"Image {os.path.basename(image_path)}", use_column_width=True)
            except Exception as e:
                st.error(f"Error loading image: {e}")

def main():
    """Main Streamlit application"""

    # Page configuration
    st.set_page_config(
        page_title="Visual Search System",
        page_icon="🔍",
        layout="wide",
        initial_sidebar_state="expanded"
    )

    # Main title
    st.title("Visual Search System")
    st.markdown("---")

    # Sidebar for navigation
    st.sidebar.header("Navigation")
    search_option = st.sidebar.selectbox(
        "Choose search method:",
        ["Search by ID", "Range by Block"]
    )

    # Main content area
    if search_option == "Search by ID":
        st.header("Search Images by ID")

        # Search input
        search_id = st.text_input(
            "Enter image ID (e.g., '0001', '1234') or leave empty to see first 500 images:",
            placeholder="Enter ID or leave empty",
            help="Enter a specific image ID or leave empty to browse the first 500 images"
        )

        # Search button
        if st.button("Search", type="primary") or search_id != "":
            with st.spinner("Searching images..."):
                matching_images = search_images_by_id(search_id)

            if matching_images:
                # display_image_grid appends the image count to the title
                display_image_grid(matching_images, "Search results")
            else:
                st.info("No images found matching your search criteria.")

    else:  # Range by Block
        st.header("📦 Browse Images by Block")

        st.markdown(f"""
        **How it works:**
        - Each block contains **{IMAGES_PER_BLOCK} images**
        - Enter a number between **1 and {TOTAL_BLOCKS}**
        - Example: Enter **100** to see images **9901-10000**
        """)

        # Block input
        block_number = st.number_input(
            f"Enter block number (1-{TOTAL_BLOCKS}):",
            min_value=1,
            max_value=TOTAL_BLOCKS,
            value=1,
            step=1,
            help=f"Choose a block number from 1 to {TOTAL_BLOCKS}"
        )

        # Calculate and display block info
        start_num = (block_number - 1) * IMAGES_PER_BLOCK + 1
        end_num = block_number * IMAGES_PER_BLOCK

        st.info(f"**Block {block_number}**: Images {start_num:,} to {end_num:,}")

        # Get block images
        with st.spinner(f"Loading block {block_number}..."):
            block_images = get_block_images(block_number)

        if block_images:
            display_image_grid(
                block_images,
                f"Block {block_number} - Images {start_num:,} to {end_num:,}"
            )
        else:
            st.warning(f"No images found for block {block_number}.")

    # Footer
    st.markdown("---")
    st.markdown(
        "**Dataset Info:** 25,000+ high-quality images from Unsplash | "
        "Built with Streamlit and Python"
    )

def setup_and_run():
    """Setup dependencies and run the app"""
    print("Starting Visual Search System...")

    # Step 1: Install dependencies
    if not check_and_install_dependencies():
        print("Failed to install dependencies. Exiting.")
        sys.exit(1)

    print("✅ Dependencies ready!")

    # Step 2: Check and download images
    if not download_images_if_needed():
        print("Failed to prepare images. Exiting.")
        sys.exit(1)

    print("✅ Images ready!")

    # Step 3: Launch Streamlit app
    print("Launching Streamlit app...")
    main()

if __name__ == "__main__":
    setup_and_run()
requirements.txt
ADDED
@@ -0,0 +1,5 @@
streamlit>=1.28.0
pandas>=1.5.0
requests>=2.28.0
pillow>=9.0.0
tqdm>=4.64.0