---
title: SAM-Grounding-DINO
emoji: π
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
---
# SAM 2.1 + Grounding DINO Interactive Segmentation
A web application that combines Meta's SAM 2.1 with Grounding DINO for text-based and point-based image segmentation, so you can create and download the mask you want.
## Features
- **Text-Based Segmentation**: Type what you want to segment (e.g., "snoopy", "person", "car")
- **Point-Based Segmentation**: Click on objects for precise manual control
- **Multiple Mask Generation**: Generate 1-5 masks and browse through them
- **SAM 2.1 + Grounding DINO**: Powered by Meta's SAM 2.1 and IDEA Research's Grounding DINO
- **Smart Auto-Detection**: Automatically chooses between text and point modes
- **Multiple Export Formats**: Download masks as PNG, JPG, or PyTorch tensors
- **High-Resolution Display**: View images and masks in full detail
- **Real-Time Processing**: Fast inference with GPU acceleration
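The auto-detection between text and point modes could work roughly as in the sketch below. This is only an illustration: the function name `choose_mode` and the rule that a text prompt takes priority over points are assumptions, not the app's confirmed behavior.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[int, int]


def choose_mode(text_prompt: Optional[str], points: Sequence[Point]) -> str:
    """Pick a segmentation mode (hypothetical rule, not the app's actual logic).

    Returns "text" when a non-blank prompt is given, "point" when only
    points are given, and "none" when there is nothing to segment with.
    """
    if text_prompt and text_prompt.strip():
        return "text"
    if points:
        return "point"
    return "none"
```

For example, `choose_mode("snoopy", [])` selects text mode, while an empty prompt with one clicked point selects point mode.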
## Quick Start
### Installation
1. Clone or download the repository
2. Install dependencies:
```bash
pip install -r requirements.txt
```
### Running the App
```bash
streamlit run streamlit_sam_app.py
```
The app opens in your browser at `http://localhost:8501`.
## How to Use
### 1. Upload an Image
- Click "Upload an image" to select an image file
- Supported formats: JPG, JPEG, PNG, BMP
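A minimal sketch of checking a file name against the supported formats listed above. The helper name `is_supported_image` is hypothetical, and checking only the extension is an assumption; the app may inspect the file contents instead.

```python
from pathlib import Path

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```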
### 2. Add Points
Choose between **Positive** (include) and **Negative** (exclude) point mode:
#### Quick Presets:
- **Center**: Add a point at the image center
- **Top-Left**: Add a point in the top-left quarter
- **Top-Right**: Add a point in the top-right quarter
- **Random**: Add a random point anywhere
#### Manual Input:
- Enter X,Y coordinates manually
- Points are validated against image boundaries
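The presets and the boundary check can be sketched as follows. The exact coordinates are an assumption: this sketch places the quarter presets at the center of the respective quadrant, and the function names `preset_point` and `is_inside` are hypothetical.

```python
import random
from typing import Tuple


def preset_point(name: str, width: int, height: int) -> Tuple[int, int]:
    """Return (x, y) pixel coordinates for a named preset (assumed positions)."""
    if name == "center":
        return width // 2, height // 2
    if name == "top-left":
        return width // 4, height // 4
    if name == "top-right":
        return (3 * width) // 4, height // 4
    if name == "random":
        return random.randrange(width), random.randrange(height)
    raise ValueError(f"unknown preset: {name!r}")


def is_inside(x: int, y: int, width: int, height: int) -> bool:
    """Check that a point lies within the image; valid x is 0..width-1."""
    return 0 <= x < width and 0 <= y < height
```

With a 640x480 image, a manually entered point at x=640 is rejected because valid x coordinates run from 0 to 639.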
### 3. Generate Segmentation Mask
- Click "Generate Segmentation Mask"
- Adjust the mask threshold in the sidebar (0.0-1.0)
- Wait for SAM 2.0 to process (may take 10-30 seconds)
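The effect of the mask threshold can be illustrated with a short NumPy sketch. It assumes the model output has already been converted to per-pixel scores in the range 0.0-1.0; the function name `binarize` is hypothetical and the app's own post-processing may differ.

```python
import numpy as np


def binarize(scores: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn per-pixel scores in [0, 1] into a boolean mask.

    A pixel belongs to the mask when its score exceeds the threshold,
    so a lower threshold keeps more pixels and yields a larger mask.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return scores > threshold
```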
### 4. View Results
- **Original Image with Points**: Shows your input selections
- **Generated Segmentation Mask**: Red overlay on original image
- **Binary Mask Preview**: Black/white mask for download
- **Statistics**: Pixel counts and coverage percentage
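As a rough illustration of the statistics and the red overlay, the sketch below computes the coverage percentage and blends red into the masked pixels. The function names, the dictionary keys, and the 50% blend factor are assumptions for this example only.

```python
import numpy as np


def mask_statistics(mask: np.ndarray) -> dict:
    """Count mask pixels and compute the share of the image they cover."""
    mask = np.asarray(mask, dtype=bool)
    total = int(mask.size)
    covered = int(mask.sum())
    coverage = 100.0 * covered / total if total else 0.0
    return {"mask_pixels": covered, "total_pixels": total, "coverage_percent": coverage}


def red_overlay(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend red into the masked pixels of an H x W x 3 uint8 RGB image."""
    mask = np.asarray(mask, dtype=bool)
    out = image.astype(np.float32)  # astype returns a copy, the input is untouched
    red = np.array([255.0, 0.0, 0.0], dtype=np.float32)
    out[mask] = (1.0 - alpha) * out[mask] + alpha * red
    return out.astype(np.uint8)
```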
### 5. Download Results
- **Download Mask (PNG)**: Binary mask file
- **Download Overlay (PNG)**: Mask overlaid on original
- **Download Data (JSON)**: Complete metadata and statistics
## Advanced Controls
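The JSON download could be assembled along these lines. The field names and the function `build_metadata` are hypothetical; the actual file produced by the app may use a different layout.

```python
import json
from typing import Dict, List


def build_metadata(points: List[Dict[str, int]], width: int, height: int,
                   mask_pixels: int, threshold: float) -> str:
    """Serialize points, settings, and mask statistics to a JSON string."""
    total = width * height
    payload = {
        "image": {"width": width, "height": height},
        "points": points,
        "threshold": threshold,
        "statistics": {
            "mask_pixels": mask_pixels,
            "total_pixels": total,
            "coverage_percent": round(100.0 * mask_pixels / total, 2) if total else 0.0,
        },
    }
    return json.dumps(payload, indent=2)
```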
### Sidebar Options:
- **Point Mode**: Switch between Positive/Negative points
- **Mask Threshold**: Control mask sensitivity (lower = larger masks)
- **Clear Points**: Remove all points at once
### Point Management:
- View all current points with coordinates
- Delete individual points with the per-point delete buttons
- Real-time count of positive/negative points
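One way to model the point list is sketched below. The `PointSet` class is hypothetical; the label values follow the usual SAM convention of 1 for a positive (foreground) point and 0 for a negative (background) point.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PointSet:
    """Clicked points with SAM-style labels (1 = positive, 0 = negative)."""
    coords: List[Tuple[int, int]] = field(default_factory=list)
    labels: List[int] = field(default_factory=list)

    def add(self, x: int, y: int, positive: bool = True) -> None:
        self.coords.append((x, y))
        self.labels.append(1 if positive else 0)

    def delete(self, index: int) -> None:
        """Remove a single point by its position in the list."""
        del self.coords[index]
        del self.labels[index]

    def clear(self) -> None:
        self.coords.clear()
        self.labels.clear()

    def counts(self) -> Tuple[int, int]:
        """Return (number of positive points, number of negative points)."""
        positive = sum(self.labels)
        return positive, len(self.labels) - positive
```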
## Technical Details
### SAM 2.0 Model
- Uses `facebook/sam2-hiera-small` by default
- Automatically downloads model weights on first run
- Runs on GPU if available, CPU otherwise
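The GPU/CPU fallback typically comes down to a check like the one below. This is a sketch, not the app's code: `pick_device` is a hypothetical helper, and the `ImportError` guard is only there so the snippet also runs where PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is missing; the real app requires it, this is just a safe default.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running inference on: {device}")
```

The resulting string can be passed to `model.to(device)` and used when creating input tensors.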
### Dependencies
- `streamlit`: Web interface
- `torch`: PyTorch for model inference
- `transformers`: Hugging Face model loading
- `PIL` (installed as `Pillow`): Image processing
- `matplotlib`: Visualization
- `numpy`: Numerical operations
- `opencv-python`: Image processing utilities
### System Requirements
- Python 3.8+
- 4GB+ RAM recommended
- GPU recommended for faster processing
## Troubleshooting
### Common Issues:
1. **Model Download Fails**:
   - Check internet connection
   - Ensure Hugging Face access (may require token for some models)
2. **CUDA Out of Memory**:
   - Try smaller model size
   - Reduce image resolution
   - Use CPU mode: set `CUDA_VISIBLE_DEVICES=""`
3. **Slow Processing**:
   - Use GPU if available
   - Try `sam2-hiera-tiny` model for faster inference
4. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
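For the out-of-memory case, the two workarounds (CPU mode and a lower image resolution) can be sketched in Python as follows. The helper `downscaled_size` and the 1024-pixel limit are assumptions chosen for illustration; the environment variable must be set before PyTorch is first imported to take effect.

```python
import os

# Hide all GPUs from CUDA so inference falls back to the CPU.
# This has to run before torch (or any other CUDA library) is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = ""


def downscaled_size(width: int, height: int, max_side: int = 1024) -> tuple:
    """Return a size whose longest side is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The returned size can be passed to an image resize call (for example Pillow's `Image.resize`) before running the model.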
## File Structure
```
SAM/
├── streamlit_sam_app.py      # Main application
├── fixed_sam_interface.py    # Original Gradio version
├── requirements.txt          # Dependencies
└── README.md                 # This file
```
## Interface Screenshots
The app features a clean, modern interface with:
- Full-width image display
- Intuitive sidebar controls
- Real-time point visualization
- Side-by-side result comparison
- Comprehensive download options
## Contributing
Feel free to submit issues, feature requests, or pull requests!
## License
This project uses Meta's SAM 2.0 model. Please refer to Meta's license terms for the model weights.
## Acknowledgments
- Meta AI for the incredible SAM 2.0 model
- Streamlit for the amazing web app framework
- Hugging Face for model hosting
- The open-source community for all the dependencies