---
title: SAM-Grounding-DINO
emoji: 🎭
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
---
# 🎭 SAM 2.1 + Grounding DINO Interactive Segmentation
A web application that combines Meta's SAM 2.1 with Grounding DINO for text-based and point-based image segmentation, so you can create and download exactly the mask you need.
## ✨ Features
- **🔍 Text-Based Segmentation**: Type what you want to segment (e.g., "snoopy", "person", "car")
- **📍 Point-Based Segmentation**: Click on objects for precise manual control
- **🎭 Multiple Mask Generation**: Generate 1-5 masks and browse through them
- **🤖 SAM 2.1 + Grounding DINO**: Powered by Meta's SAM 2.1 and IDEA Research's Grounding DINO
- **📱 Smart Auto-Detection**: Automatically chooses between text and point modes
- **💾 Multiple Export Formats**: Download masks as PNG, JPG, or PyTorch tensors
- **🖼️ High-Resolution Display**: View images and masks in full detail
- **⚡ Real-Time Processing**: Fast inference with GPU acceleration
## 🚀 Quick Start
### Installation
1. Clone or download the repository
2. Install dependencies:
```bash
pip install -r requirements.txt
```
### Running the App
```bash
streamlit run streamlit_sam_app.py
```
The app will open in your browser at `http://localhost:8501`.
## 🎯 How to Use
### 1. Upload an Image
- Click "📷 Upload an image" to select an image file
- Supported formats: JPG, JPEG, PNG, BMP
### 2. Add Points
Choose between **Positive** (include) and **Negative** (exclude) point mode, then add points in one of two ways:
#### Quick Presets:
- **🎯 Center**: Add point at image center
- **↖️ Top-Left**: Add point at top-left quarter
- **↗️ Top-Right**: Add point at top-right quarter
- **🎲 Random**: Add random point anywhere
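The presets above boil down to simple coordinate arithmetic. The following is a minimal sketch, not the app's actual code: the helper name `preset_point` is hypothetical, and it assumes "top-left quarter" and "top-right quarter" mean the centers of those quadrants.

```python
import random


def preset_point(name, width, height, rng=None):
    """Return an (x, y) pixel coordinate for a named preset.

    Hypothetical helper: the quadrant presets are assumed to target
    the center of the corresponding image quarter.
    """
    rng = rng or random
    if name == "center":
        return (width // 2, height // 2)
    if name == "top-left":
        return (width // 4, height // 4)
    if name == "top-right":
        return (3 * width // 4, height // 4)
    if name == "random":
        # randrange excludes the upper bound, so the point stays in the image
        return (rng.randrange(width), rng.randrange(height))
    raise ValueError(f"unknown preset: {name!r}")


print(preset_point("center", 640, 480))     # (320, 240)
print(preset_point("top-right", 640, 480))  # (480, 120)
```

Passing a seeded `random.Random` instance as `rng` makes the random preset reproducible, which is handy for testing.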
#### Manual Input:
- Enter X,Y coordinates manually
- Points are validated against image boundaries
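Boundary validation of a manually entered point could look roughly like this. It is a sketch under assumptions: the function name `parse_point` and the `"x,y"` text format are illustrative, and the app may validate its inputs differently.

```python
def parse_point(text, width, height):
    """Parse an 'x,y' string and check it lies inside a width x height image."""
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError("expected two comma-separated values, e.g. '120,80'")
    # int() raises ValueError on non-numeric input, which we let propagate
    x, y = (int(p.strip()) for p in parts)
    # Valid pixel indices run from 0 to width-1 and 0 to height-1
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError(f"point ({x}, {y}) is outside the {width}x{height} image")
    return x, y


print(parse_point("120, 80", 640, 480))  # (120, 80)
```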
### 3. Generate Segmentation Mask
- Click "🎯 Generate Segmentation Mask"
- Adjust the mask threshold in the sidebar (0.0-1.0)
- Wait for the model to finish processing (may take 10-30 seconds)
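To see why a lower threshold produces a larger mask, here is a small sketch. It assumes the model output has already been converted to per-pixel scores in the range 0.0-1.0; the helper name `binarize` is hypothetical and the app's own thresholding may differ.

```python
import numpy as np


def binarize(scores, threshold=0.5):
    """Turn per-pixel scores in [0, 1] into a boolean mask."""
    return np.asarray(scores) > threshold


scores = np.array([[0.1, 0.4],
                   [0.6, 0.9]])

print(int(binarize(scores, 0.5).sum()))  # 2 pixels selected
print(int(binarize(scores, 0.3).sum()))  # 3 pixels selected: lower threshold, larger mask
```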
### 4. View Results
- **Original Image with Points**: Shows your input selections
- **Generated Segmentation Mask**: Red overlay on original image
- **Binary Mask Preview**: Black/white mask for download
- **Statistics**: Pixel counts and coverage percentage
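The red overlay and the coverage statistics can both be derived from a boolean mask. The sketch below shows one plausible way to compute them with `numpy`; the function names, the blending factor `alpha`, and the dictionary keys are assumptions for illustration, not the app's actual implementation.

```python
import numpy as np


def overlay_red(image, mask, alpha=0.5):
    """Blend pure red into the masked pixels of an RGB uint8 image."""
    out = image.astype(np.float32)  # astype copies, so the input is untouched
    red = np.array([255.0, 0.0, 0.0], dtype=np.float32)
    out[mask] = (1 - alpha) * out[mask] + alpha * red
    return out.round().astype(np.uint8)


def mask_stats(mask):
    """Count masked pixels and compute coverage as a percentage."""
    total = int(mask.size)
    selected = int(mask.sum())
    return {
        "mask_pixels": selected,
        "total_pixels": total,
        "coverage_percent": 100.0 * selected / total,
    }


image = np.zeros((2, 2, 3), dtype=np.uint8)  # tiny all-black test image
mask = np.array([[True, False],
                 [False, False]])

print(mask_stats(mask))
print(overlay_red(image, mask, alpha=1.0)[0, 0])
```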
### 5. Download Results
- **📥 Download Mask (PNG)**: Binary mask file
- **📥 Download Overlay (PNG)**: Mask overlaid on original
- **📥 Download Data (JSON)**: Complete metadata and statistics
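As a rough idea of what the JSON download might contain, the sketch below assembles points, threshold, and statistics into one document. Every field name here is hypothetical; check an actual export from the app for the real schema.

```python
import json


def build_export(points, width, height, threshold, mask_pixels):
    """Assemble a JSON-serializable summary (hypothetical schema)."""
    total = width * height
    return {
        "image_size": {"width": width, "height": height},
        "points": [{"x": x, "y": y, "label": label} for x, y, label in points],
        "mask_threshold": threshold,
        "statistics": {
            "mask_pixels": mask_pixels,
            "total_pixels": total,
            "coverage_percent": 100.0 * mask_pixels / total,
        },
    }


payload = build_export([(320, 240, "positive")], 640, 480, 0.5, 76800)
text = json.dumps(payload, indent=2)
print(text)
```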
## 🎛️ Advanced Controls
### Sidebar Options:
- **Point Mode**: Switch between Positive/Negative points
- **Mask Threshold**: Control mask sensitivity (lower = larger masks)
- **Clear Points**: Remove all points at once
### Point Management:
- View all current points with coordinates
- Delete individual points with 🗑️ buttons
- Real-time count of positive/negative points
## 🔧 Technical Details
### SAM 2.0 Model
- Uses `facebook/sam2-hiera-small` by default
- Automatically downloads model weights on first run
- Runs on GPU if available, CPU otherwise
### Dependencies
- `streamlit`: Web interface
- `torch`: PyTorch for model inference
- `transformers`: Hugging Face model loading
- `Pillow` (imported as `PIL`): Image processing
- `matplotlib`: Visualization
- `numpy`: Numerical operations
- `opencv-python`: Image processing utilities
### System Requirements
- Python 3.8+
- 4GB+ RAM recommended
- GPU recommended for faster processing
## 🐛 Troubleshooting
### Common Issues:
1. **Model Download Fails**:
   - Check your internet connection
   - Ensure Hugging Face access (some models may require a token)
2. **CUDA Out of Memory**:
   - Try a smaller model size
   - Reduce image resolution
   - Use CPU mode: set `CUDA_VISIBLE_DEVICES=""`
3. **Slow Processing**:
   - Use a GPU if available
   - Try the `sam2-hiera-tiny` model for faster inference
4. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
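If you prefer to force CPU mode from Python rather than from the shell, the environment variable has to be set before any CUDA-using library is imported. A minimal sketch follows; the helper name `force_cpu` is hypothetical.

```python
import os


def force_cpu():
    """Hide all CUDA devices from libraries imported afterwards."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


force_cpu()

# Import torch (or anything else that initializes CUDA) only after this point,
# so that it sees no GPUs and falls back to the CPU.
print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))
```

Setting the variable after CUDA has already been initialized in the process generally has no effect, which is why the call comes first.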
## 📁 File Structure
```
SAM/
├── streamlit_sam_app.py    # Main application
├── fixed_sam_interface.py  # Original Gradio version
├── requirements.txt        # Dependencies
└── README.md # This file
```
## 🎨 Interface Screenshots
The app features a clean, modern interface with:
- Full-width image display
- Intuitive sidebar controls
- Real-time point visualization
- Side-by-side result comparison
- Comprehensive download options
## 🤝 Contributing
Feel free to submit issues, feature requests, or pull requests!
## 📄 License
This project uses Meta's SAM 2.0 model. Please refer to Meta's license terms for the model weights.
## 🙏 Acknowledgments
- Meta AI for the incredible SAM 2.0 model
- Streamlit for the amazing web app framework
- Hugging Face for model hosting
- The open-source community for all the dependencies