title: SAM-Grounding-DINO
emoji: π
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
SAM 2.1 + Grounding DINO Interactive Segmentation
A web application that combines Meta's SAM 2.1 with Grounding DINO for text-based and point-based image segmentation, so you can create and download the mask you want.
Features
- Text-Based Segmentation: Type what you want to segment (e.g., "snoopy", "person", "car")
- Point-Based Segmentation: Click on objects for precise manual control
- Multiple Mask Generation: Generate 1-5 masks and browse through them
- SAM 2.1 + Grounding DINO: Powered by Meta's SAM 2.1 and IDEA Research's Grounding DINO
- Smart Auto-Detection: Automatically chooses between text and point modes
- Multiple Export Formats: Download masks as PNG, JPG, or PyTorch tensors
- High-Resolution Display: View images and masks in full detail
- Real-Time Processing: Fast inference with GPU acceleration
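How the auto-detection decides is not spelled out in this README. As a rough sketch, assuming a non-empty text prompt takes priority and clicked points are the fallback, the choice could look like this (`choose_mode` and its return labels are hypothetical names, not taken from the app):

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def choose_mode(text_prompt: Optional[str], points: List[Point]) -> str:
    """Return "text", "point", or "none" (hypothetical labels).

    Assumption: a non-empty text prompt wins over clicked points.
    """
    if text_prompt and text_prompt.strip():
        return "text"
    if points:
        return "point"
    return "none"
```

With this rule, a whitespace-only prompt is treated as empty, so the app would fall back to point mode whenever points exist.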
Quick Start
Installation
- Clone or download the repository
- Install dependencies:
pip install -r requirements.txt
Running the App
streamlit run streamlit_sam_app.py
The app will open in your browser at http://localhost:8501
How to Use
1. Upload an Image
- Click "Upload an image" to select an image file
- Supported formats: JPG, JPEG, PNG, BMP
2. Add Points
Choose between Positive (include) or Negative (exclude) point mode:
Quick Presets:
- Center: Add a point at the image center
- Top-Left: Add a point in the top-left quarter
- Top-Right: Add a point in the top-right quarter
- Random: Add a random point anywhere
Manual Input:
- Enter X,Y coordinates manually
- Points are validated against image boundaries
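The presets and the boundary check can be sketched as below. This is an illustration only: it assumes the quarter presets land on the center of each quadrant, and the names `preset_point` and `is_inside` are hypothetical, not the app's own functions.

```python
import random
from typing import Tuple


def preset_point(preset: str, width: int, height: int) -> Tuple[int, int]:
    """Map a preset name to pixel coordinates (assumed quadrant centers)."""
    if preset == "center":
        return width // 2, height // 2
    if preset == "top-left":
        return width // 4, height // 4
    if preset == "top-right":
        return (3 * width) // 4, height // 4
    if preset == "random":
        return random.randrange(width), random.randrange(height)
    raise ValueError(f"unknown preset: {preset!r}")


def is_inside(x: int, y: int, width: int, height: int) -> bool:
    """True when (x, y) is a valid pixel index for a width x height image."""
    return 0 <= x < width and 0 <= y < height
```

Manually entered coordinates would pass through `is_inside` before being accepted; a point at `x == width` is rejected because pixel indices start at 0.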
3. Generate Segmentation Mask
- Click "Generate Segmentation Mask"
- Adjust the mask threshold in the sidebar (0.0-1.0)
- Wait for SAM 2.0 to process (may take 10-30 seconds)
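The threshold turns per-pixel scores into a binary mask. A minimal sketch, assuming the model output has already been converted to probabilities in [0, 1] (the app's actual post-processing is not shown in this README, and `binarize` is a hypothetical helper):

```python
from typing import List


def binarize(probs: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Mark a pixel as foreground (1) when its probability exceeds the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0.0, 1.0]")
    return [[1 if p > threshold else 0 for p in row] for row in probs]


probs = [[0.1, 0.4], [0.6, 0.9]]
low = binarize(probs, 0.3)   # more pixels pass a low threshold
high = binarize(probs, 0.7)  # fewer pixels pass a high threshold
```

This is why a lower threshold yields a larger mask: more pixels clear the bar.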
4. View Results
- Original Image with Points: Shows your input selections
- Generated Segmentation Mask: Red overlay on original image
- Binary Mask Preview: Black/white mask for download
- Statistics: Pixel counts and coverage percentage
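The statistics amount to counting foreground pixels. A small sketch, assuming the binary mask is a 2D grid of 0/1 values (`mask_stats` and its key names are hypothetical):

```python
from typing import Dict, List


def mask_stats(mask: List[List[int]]) -> Dict[str, float]:
    """Count foreground pixels and report coverage as a percentage."""
    total = sum(len(row) for row in mask)
    foreground = sum(1 for row in mask for value in row if value)
    coverage = 100.0 * foreground / total if total else 0.0
    return {
        "mask_pixels": foreground,
        "total_pixels": total,
        "coverage_percent": coverage,
    }
```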
5. Download Results
- Download Mask (PNG): Binary mask file
- Download Overlay (PNG): Mask overlaid on original
- Download Data (JSON): Complete metadata and statistics
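The exact layout of the JSON download is not documented here. As an illustration only, a payload carrying the points, the threshold, and the statistics could be assembled like this (every field name below is an assumption, and `build_export` is a hypothetical helper):

```python
import json
from typing import List, Tuple


def build_export(points: List[Tuple[int, int, bool]], threshold: float,
                 mask_pixels: int, total_pixels: int) -> str:
    """Serialize points, threshold, and mask statistics (assumed schema)."""
    coverage = 100.0 * mask_pixels / total_pixels if total_pixels else 0.0
    payload = {
        "points": [
            {"x": x, "y": y, "label": "positive" if positive else "negative"}
            for x, y, positive in points
        ],
        "threshold": threshold,
        "statistics": {
            "mask_pixels": mask_pixels,
            "total_pixels": total_pixels,
            "coverage_percent": coverage,
        },
    }
    return json.dumps(payload, indent=2)
```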
Advanced Controls
Sidebar Options:
- Point Mode: Switch between Positive/Negative points
- Mask Threshold: Control mask sensitivity (lower = larger masks)
- Clear Points: Remove all points at once
Point Management:
- View all current points with coordinates
- Delete individual points with their delete buttons
- Real-time count of positive/negative points
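Point management boils down to a small list with add, delete, clear, and count operations. A minimal sketch, using label 1 for positive and 0 for negative points in the way SAM-style point prompts usually do (`PointStore` is a hypothetical class, not the app's own state object):

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PointStore:
    """Hypothetical container for clicked points: (x, y, label)."""
    points: List[Tuple[int, int, int]] = field(default_factory=list)

    def add(self, x: int, y: int, positive: bool = True) -> None:
        self.points.append((x, y, 1 if positive else 0))

    def delete(self, index: int) -> None:
        del self.points[index]

    def clear(self) -> None:
        self.points.clear()

    def counts(self) -> Tuple[int, int]:
        """Return (positive_count, negative_count)."""
        positive = sum(label for _, _, label in self.points)
        return positive, len(self.points) - positive
```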
Technical Details
SAM 2.0 Model
- Uses facebook/sam2-hiera-small by default
- Automatically downloads model weights on first run
- Runs on GPU if available, CPU otherwise
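The GPU-or-CPU fallback is commonly written with `torch.cuda.is_available()`. A sketch of that pattern follows; whether the app does exactly this is an assumption, and `pick_device` is a hypothetical helper:

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, else "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
```

The returned string can be passed to `model.to(device)` and to any input tensors so that both live on the same device.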
Dependencies
- streamlit: Web interface
- torch: PyTorch for model inference
- transformers: Hugging Face model loading
- PIL: Image processing
- matplotlib: Visualization
- numpy: Numerical operations
- opencv-python: Image processing utilities
System Requirements
- Python 3.8+
- 4GB+ RAM recommended
- GPU recommended for faster processing
Troubleshooting
Common Issues:
Model Download Fails:
- Check internet connection
- Ensure Hugging Face access (may require token for some models)
CUDA Out of Memory:
- Try smaller model size
- Reduce image resolution
- Use CPU mode: set CUDA_VISIBLE_DEVICES=""
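Two of these remedies can also be applied from Python. The sketch below hides all GPUs from the process and computes a reduced image size; the 1024-pixel cap is an arbitrary assumption, and `fit_within` is a hypothetical helper rather than part of the app:

```python
import os
from typing import Tuple

# Hide all GPUs. Set this before PyTorch initializes CUDA; doing it before
# the first `import torch` is the safest place.
os.environ["CUDA_VISIBLE_DEVICES"] = ""


def fit_within(width: int, height: int, max_side: int = 1024) -> Tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The resulting size can be handed to an image library's resize call before the image is sent to the model.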
Slow Processing:
- Use GPU if available
- Try the sam2-hiera-tiny model for faster inference
Import Errors:
- Ensure all dependencies are installed:
pip install -r requirements.txt
File Structure
SAM/
├── streamlit_sam_app.py # Main application
├── fixed_sam_interface.py # Original Gradio version
├── requirements.txt # Dependencies
└── README.md # This file
Interface Screenshots
The app features a clean, modern interface with:
- Full-width image display
- Intuitive sidebar controls
- Real-time point visualization
- Side-by-side result comparison
- Comprehensive download options
Contributing
Feel free to submit issues, feature requests, or pull requests!
License
This project uses Meta's SAM 2.0 model. Please refer to Meta's license terms for the model weights.
Acknowledgments
- Meta AI for the incredible SAM 2.0 model
- Streamlit for the amazing web app framework
- Hugging Face for model hosting
- The open-source community for all the dependencies