---
title: SAM-Grounding-DINO
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
---
# SAM 2.1 + Grounding DINO Interactive Segmentation
A web application that combines Meta's SAM 2.1 and Grounding DINO for text-based and point-based image segmentation, so you can create and download the mask you want.
## Features
- **Text-Based Segmentation**: Type what you want to segment (e.g., "snoopy", "person", "car")
- **Point-Based Segmentation**: Click on objects for precise manual control
- **Multiple Mask Generation**: Generate 1-5 masks and browse through them
- **SAM 2.1 + Grounding DINO**: Powered by Meta's SAM 2.1 and IDEA Research's Grounding DINO
- **Smart Auto-Detection**: Automatically chooses between text and point modes
- **Multiple Export Formats**: Download masks as PNG, JPG, or PyTorch tensors
- **High-Resolution Display**: View images and masks in full detail
- **Real-Time Processing**: Fast inference with GPU acceleration
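
The README does not say how the auto-detection picks a mode. As a minimal sketch, assuming a non-empty text prompt takes priority over clicked points, the choice could look like the following. The function name `choose_mode` and the `(x, y, label)` point layout are hypothetical and not taken from the app's code.

```python
from typing import List, Optional, Tuple

# Hypothetical point layout: (x, y, label), where label 1 = positive, 0 = negative.
Point = Tuple[int, int, int]


def choose_mode(text_prompt: Optional[str], points: List[Point]) -> str:
    """Return "text" or "point" depending on which input the user supplied.

    Assumed rule: a non-blank text prompt wins; otherwise fall back to points.
    """
    if text_prompt and text_prompt.strip():
        return "text"
    if points:
        return "point"
    raise ValueError("Provide a text prompt or at least one point")


if __name__ == "__main__":
    print(choose_mode("snoopy", []))
    print(choose_mode("   ", [(120, 80, 1)]))
```

Under this assumed rule, a blank or whitespace-only prompt is treated as no prompt, so the clicked points decide the mode.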
## Quick Start
### Installation
1. Clone or download the repository
2. Install dependencies:
```bash
pip install -r requirements.txt
```
### Running the App
```bash
streamlit run streamlit_sam_app.py
```
The app will open in your browser at `http://localhost:8501`.
## How to Use
### 1. Upload an Image
- Click "Upload an image" to select an image file
- Supported formats: JPG, JPEG, PNG, BMP
### 2. Add Points
Choose between **Positive** (include) and **Negative** (exclude) point mode:
#### Quick Presets:
- **Center**: Add a point at the image center
- **Top-Left**: Add a point in the top-left quarter
- **Top-Right**: Add a point in the top-right quarter
- **Random**: Add a random point anywhere in the image
#### Manual Input:
- Enter X,Y coordinates manually
- Points are validated against image boundaries
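
The presets and the boundary check above can be sketched as follows. This is an illustration under assumptions, not the app's actual code: the helper names `preset_point` and `is_valid_point` are hypothetical, and "top-left quarter" is interpreted here as the center of the top-left quadrant.

```python
import random
from typing import Tuple


def preset_point(name: str, width: int, height: int) -> Tuple[int, int]:
    """Return (x, y) pixel coordinates for a named preset (hypothetical helper)."""
    if name == "center":
        return width // 2, height // 2
    if name == "top-left":
        # Assumption: the middle of the top-left quadrant.
        return width // 4, height // 4
    if name == "top-right":
        # Assumption: the middle of the top-right quadrant.
        return (3 * width) // 4, height // 4
    if name == "random":
        return random.randrange(width), random.randrange(height)
    raise ValueError(f"Unknown preset: {name}")


def is_valid_point(x: int, y: int, width: int, height: int) -> bool:
    """A point is valid when it lies inside the image: 0 <= x < width, 0 <= y < height."""
    return 0 <= x < width and 0 <= y < height


if __name__ == "__main__":
    w, h = 640, 480
    for name in ("center", "top-left", "top-right", "random"):
        x, y = preset_point(name, w, h)
        print(name, (x, y), is_valid_point(x, y, w, h))
```

Note that the valid range is half-open: on a 640x480 image, `x = 640` is already out of bounds because pixel columns run from 0 to 639.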
### 3. Generate Segmentation Mask
- Click "Generate Segmentation Mask"
- Adjust the mask threshold in the sidebar (0.0-1.0)
- Wait for SAM 2.0 to process (may take 10-30 seconds)
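
To make the threshold's effect concrete, here is a small sketch assuming the model yields a per-pixel probability in the range 0.0-1.0 and the threshold turns it into a binary mask. The function `apply_threshold` is hypothetical, and plain Python lists are used only to keep the example dependency-free; the app itself most likely works on NumPy arrays or PyTorch tensors.

```python
from typing import List


def apply_threshold(probs: List[List[float]], threshold: float) -> List[List[int]]:
    """Binarize a probability map: 1 where the probability exceeds the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return [[1 if p > threshold else 0 for p in row] for row in probs]


if __name__ == "__main__":
    # Toy 2x3 probability map standing in for real model output.
    probs = [
        [0.1, 0.4, 0.9],
        [0.2, 0.6, 0.8],
    ]
    for t in (0.3, 0.7):
        mask = apply_threshold(probs, t)
        print(f"threshold={t}: {sum(map(sum, mask))} foreground pixels")
```

Lowering the threshold lets more pixels through, which is why a lower value produces a larger mask.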
### 4. View Results
- **Original Image with Points**: Shows your input selections
- **Generated Segmentation Mask**: Red overlay on the original image
- **Binary Mask Preview**: Black/white mask for download
- **Statistics**: Pixel counts and coverage percentage
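
The statistics listed above can be computed from the binary mask alone. The sketch below assumes coverage means foreground pixels divided by total pixels; the function name and dictionary keys are hypothetical and may not match what the app reports.

```python
from typing import Dict, List


def mask_statistics(mask: List[List[int]]) -> Dict[str, float]:
    """Count foreground pixels and express them as a percentage of the image."""
    total = sum(len(row) for row in mask)
    foreground = sum(1 for row in mask for value in row if value)
    coverage = 100.0 * foreground / total if total else 0.0
    return {
        "mask_pixels": foreground,
        "total_pixels": total,
        "coverage_percent": coverage,
    }


if __name__ == "__main__":
    # Toy 2x2 mask: three of four pixels belong to the object.
    print(mask_statistics([[1, 0], [1, 1]]))
```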
### 5. Download Results
- **Download Mask (PNG)**: Binary mask file
- **Download Overlay (PNG)**: Mask overlaid on the original image
- **Download Data (JSON)**: Complete metadata and statistics
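
The README does not document the layout of the JSON download. As a rough sketch of what such an export could contain, assuming it records the input points, the threshold, and the statistics, the payload might be built like this. Every field name below is hypothetical.

```python
import json
from typing import Dict, List, Tuple


def build_export(points: List[Tuple[int, int, int]],
                 threshold: float,
                 stats: Dict[str, float]) -> str:
    """Serialize the session state to a JSON string (hypothetical schema)."""
    payload = {
        "points": [
            {"x": x, "y": y, "label": "positive" if label == 1 else "negative"}
            for x, y, label in points
        ],
        "mask_threshold": threshold,
        "statistics": stats,
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    text = build_export(
        points=[(320, 240, 1), (10, 10, 0)],
        threshold=0.5,
        stats={"mask_pixels": 3, "total_pixels": 4, "coverage_percent": 75.0},
    )
    print(text)
```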
## Advanced Controls
### Sidebar Options:
- **Point Mode**: Switch between Positive/Negative points
- **Mask Threshold**: Control mask sensitivity (lower = larger masks)
- **Clear Points**: Remove all points at once
### Point Management:
- View all current points with coordinates
- Delete individual points with their delete buttons
- Real-time count of positive/negative points
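
The point-management behavior above can be modeled as a plain list of labeled points. This is only a sketch: the `(x, y, label)` layout with 1 for positive and 0 for negative is an assumption, and `count_points` and `delete_point` are hypothetical helpers rather than functions from the app.

```python
from collections import Counter
from typing import Dict, List, Tuple

Point = Tuple[int, int, int]  # (x, y, label); assumed: 1 = positive, 0 = negative


def count_points(points: List[Point]) -> Dict[str, int]:
    """Return how many positive and negative points are currently set."""
    counts = Counter(label for _, _, label in points)
    return {"positive": counts[1], "negative": counts[0]}


def delete_point(points: List[Point], index: int) -> List[Point]:
    """Return a new list without the point at `index`; the input is not modified."""
    if not 0 <= index < len(points):
        raise IndexError(f"No point at index {index}")
    return points[:index] + points[index + 1:]


if __name__ == "__main__":
    points = [(320, 240, 1), (10, 10, 0), (400, 100, 1)]
    print(count_points(points))
    points = delete_point(points, 1)
    print(count_points(points))
```

Returning a new list instead of mutating in place is a design choice that tends to fit UI frameworks that re-run the script on each interaction, since the updated list can simply be stored back into session state.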
## Technical Details
### SAM 2.0 Model
- Uses `facebook/sam2-hiera-small` by default
- Automatically downloads model weights on first run
- Runs on GPU if available, CPU otherwise
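
The GPU-or-CPU fallback is a common PyTorch pattern. A minimal sketch follows, assuming the app selects the device with `torch.cuda.is_available()`; the helper name `pick_device` is hypothetical, and the import guard exists only so the snippet also runs where PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to run on a GPU; report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = pick_device()
    print(f"Running inference on: {device}")
    # A loaded model would then typically be moved with model.to(device).
```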
### Dependencies
- `streamlit`: Web interface
- `torch`: PyTorch for model inference
- `transformers`: Hugging Face model loading
- `PIL` (installed as `Pillow`): Image processing
- `matplotlib`: Visualization
- `numpy`: Numerical operations
- `opencv-python`: Image processing utilities
### System Requirements
- Python 3.8+
- 4 GB+ RAM recommended
- GPU recommended for faster processing
## Troubleshooting
### Common Issues:
1. **Model Download Fails**:
   - Check your internet connection
   - Ensure Hugging Face access (some models may require a token)
2. **CUDA Out of Memory**:
   - Try a smaller model size
   - Reduce the image resolution
   - Use CPU mode: set `CUDA_VISIBLE_DEVICES=""`
3. **Slow Processing**:
   - Use a GPU if available
   - Try the `sam2-hiera-tiny` model for faster inference
4. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
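
For the CPU-mode workaround, `CUDA_VISIBLE_DEVICES` can also be set from Python instead of the shell. The sketch below assumes you control the app's entry point; the variable has to be set before CUDA is initialized, so the safest place is before `torch` is imported. The helper name `force_cpu_mode` is hypothetical.

```python
import os


def force_cpu_mode() -> None:
    """Hide all CUDA devices from this process.

    Call this before importing torch (or any other CUDA-using library);
    changing the variable after CUDA has been initialized has no effect.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


if __name__ == "__main__":
    force_cpu_mode()
    print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))
    # Importing torch after this point should report no available GPU,
    # so device selection falls back to the CPU.
```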
## File Structure
```
SAM/
├── streamlit_sam_app.py      # Main application
├── fixed_sam_interface.py    # Original Gradio version
├── requirements.txt          # Dependencies
└── README.md                 # This file
```
## Interface Screenshots
The app features a clean, modern interface with:
- Full-width image display
- Intuitive sidebar controls
- Real-time point visualization
- Side-by-side result comparison
- Comprehensive download options
## Contributing
Feel free to submit issues, feature requests, or pull requests!
## License
This project uses Meta's SAM 2.0 model. Please refer to Meta's license terms for the model weights.
## Acknowledgments
- Meta AI for the incredible SAM 2.0 model
- Streamlit for the amazing web app framework
- Hugging Face for model hosting
- The open-source community for all the dependencies