---
title: SAM-Grounding-DINO
emoji: 🎭
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
---
# 🎭 SAM 2.1 + Grounding DINO Interactive Segmentation

A web application that combines Meta's SAM 2.1 with Grounding DINO for text-based and point-based image segmentation, so you can create and download the mask you want.

## ✨ Features

- **🔍 Text-Based Segmentation**: Type what you want to segment (e.g., "snoopy", "person", "car")
- **📍 Point-Based Segmentation**: Click on objects for precise manual control
- **🎭 Multiple Mask Generation**: Generate 1-5 masks and browse through them
- **🤖 SAM 2.1 + Grounding DINO**: Powered by Meta's SAM 2.1 and IDEA Research's Grounding DINO
- **📱 Smart Auto-Detection**: Automatically chooses between text and point modes
- **💾 Multiple Export Formats**: Download masks as PNG, JPG, or PyTorch tensors
- **🖼️ High-Resolution Display**: View images and masks in full detail
- **⚡ Real-Time Processing**: Fast inference with GPU acceleration

## 🚀 Quick Start

### Installation

1. Clone or download the repository
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

### Running the App

```bash
streamlit run streamlit_sam_app.py
```

The app will open in your browser at `http://localhost:8501`

## 🎯 How to Use

### 1. Upload an Image
- Click "📷 Upload an image" to select an image file
- Supported formats: JPG, JPEG, PNG, BMP

### 2. Add Points
Choose between **Positive** (include) or **Negative** (exclude) point mode:

#### Quick Presets:
- **🎯 Center**: Add point at image center
- **↖️ Top-Left**: Add point at top-left quarter
- **↗️ Top-Right**: Add point at top-right quarter
- **🎲 Random**: Add random point anywhere
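The presets above can be sketched as a small coordinate helper. This is only an illustration: `preset_point` is a hypothetical name, and it assumes that "top-left quarter" and "top-right quarter" mean the centers of those quadrants, which the app may define differently.

```python
import random


def preset_point(preset, width, height, rng=random):
    """Return an (x, y) pixel coordinate for a named preset.

    Hypothetical helper; the quadrant-center interpretation is an assumption.
    """
    if preset == "center":
        return (width // 2, height // 2)
    if preset == "top-left":
        return (width // 4, height // 4)
    if preset == "top-right":
        return (3 * width // 4, height // 4)
    if preset == "random":
        # randrange excludes the upper bound, so the point stays inside the image
        return (rng.randrange(width), rng.randrange(height))
    raise ValueError(f"unknown preset: {preset!r}")
```

For a 640x480 image, `preset_point("center", 640, 480)` gives `(320, 240)`.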

#### Manual Input:
- Enter X,Y coordinates manually
- Points are validated against image boundaries
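A minimal sketch of what the boundary validation might look like, assuming the coordinates arrive as an `"X,Y"` string and are zero-based pixel indices. `parse_point` is a hypothetical name, not a function taken from the app.

```python
def parse_point(text, width, height):
    """Parse "X,Y" and check that it lies inside a width x height image.

    Hypothetical helper; assumes zero-based integer pixel coordinates.
    Raises ValueError for malformed input or out-of-bounds points.
    """
    x_str, y_str = text.split(",")  # wrong number of parts raises ValueError
    x, y = int(x_str), int(y_str)   # int() tolerates surrounding whitespace
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError(f"point ({x}, {y}) is outside a {width}x{height} image")
    return x, y
```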

### 3. Generate Segmentation Mask
- Click "🎯 Generate Segmentation Mask"
- Adjust the mask threshold in the sidebar (0.0-1.0)
- Wait for SAM 2.0 to process (may take 10-30 seconds)
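To illustrate what the mask threshold does, here is a sketch that assumes the model yields a per-pixel probability map in the range 0.0-1.0. The `binarize` helper is hypothetical, and the app's actual post-processing may differ.

```python
import numpy as np


def binarize(prob_map, threshold=0.5):
    """Turn a probability map into a boolean mask (hypothetical helper).

    Pixels whose probability exceeds the threshold are kept.
    """
    return np.asarray(prob_map) > threshold


probs = np.array([[0.1, 0.4],
                  [0.6, 0.9]])
strict = binarize(probs, threshold=0.5)  # keeps the two most confident pixels
loose = binarize(probs, threshold=0.3)   # a lower threshold keeps one more
```

Under this assumption, lowering the threshold can only add pixels to the mask, which is why lower values give larger masks.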

### 4. View Results
- **Original Image with Points**: Shows your input selections
- **Generated Segmentation Mask**: Red overlay on original image
- **Binary Mask Preview**: Black/white mask for download
- **Statistics**: Pixel counts and coverage percentage
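The statistics can be reproduced from a binary mask with a few lines of NumPy. This is a sketch only: `mask_stats` and its dictionary keys are illustrative names, not the app's actual output format.

```python
import numpy as np


def mask_stats(mask):
    """Return pixel counts and coverage for a binary mask (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    total = int(mask.size)
    foreground = int(mask.sum())
    coverage = 100.0 * foreground / total if total else 0.0
    return {
        "mask_pixels": foreground,
        "total_pixels": total,
        "coverage_percent": coverage,
    }
```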

### 5. Download Results
- **📥 Download Mask (PNG)**: Binary mask file
- **📥 Download Overlay (PNG)**: Mask overlaid on original
- **📥 Download Data (JSON)**: Complete metadata and statistics
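As a rough idea of what the JSON download could contain, the sketch below serializes points, threshold, and statistics. The field names and the `build_export` helper are assumptions made for illustration; the real schema is defined by the app. Labels follow the usual SAM convention of 1 for a positive point and 0 for a negative one.

```python
import json


def build_export(points, threshold, stats):
    """Serialize segmentation metadata to JSON (hypothetical schema).

    `points` is a list of (x, y, label) tuples, label 1 = positive, 0 = negative.
    """
    payload = {
        "points": [{"x": x, "y": y, "label": label} for x, y, label in points],
        "mask_threshold": threshold,
        "statistics": stats,
    }
    return json.dumps(payload, indent=2)
```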

## 🎛️ Advanced Controls

### Sidebar Options:
- **Point Mode**: Switch between Positive/Negative points
- **Mask Threshold**: Control mask sensitivity (lower = larger masks)
- **Clear Points**: Remove all points at once

### Point Management:
- View all current points with coordinates
- Delete individual points with 🗑️ buttons
- Real-time count of positive/negative points
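One way to model the point list behind these controls is shown below. The `Point` class and both helpers are hypothetical; the app may store its points differently.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int
    positive: bool  # True = include, False = exclude


def count_points(points):
    """Return (positive_count, negative_count) for a list of Point objects."""
    positive = sum(p.positive for p in points)
    return positive, len(points) - positive


def delete_point(points, index):
    """Return a new list without the point at `index`, leaving the input untouched."""
    return points[:index] + points[index + 1:]
```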

## 🔧 Technical Details

### SAM 2.0 Model
- Uses `facebook/sam2-hiera-small` by default
- Automatically downloads model weights on first run
- Runs on GPU if available, CPU otherwise
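The GPU/CPU fallback typically comes down to a single check. The sketch below uses `torch.cuda.is_available()` and falls back to `"cpu"` when PyTorch is not installed; `pick_device` is a hypothetical name, and the app's own selection logic may differ.

```python
def pick_device():
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu" (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to run on a GPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The returned string can be passed to `model.to(...)` or `tensor.to(...)` in PyTorch.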

### Dependencies
- `streamlit`: Web interface
- `torch`: PyTorch for model inference
- `transformers`: Hugging Face model loading
- `Pillow` (imported as `PIL`): Image processing
- `matplotlib`: Visualization
- `numpy`: Numerical operations
- `opencv-python`: Image processing utilities

### System Requirements
- Python 3.8+
- 4GB+ RAM recommended
- GPU recommended for faster processing

## 🐛 Troubleshooting

### Common Issues:

1. **Model Download Fails**:
   - Check internet connection
   - Ensure Hugging Face access (may require token for some models)

2. **CUDA Out of Memory**:
   - Try smaller model size
   - Reduce image resolution
   - Use CPU mode: set `CUDA_VISIBLE_DEVICES=""`

3. **Slow Processing**:
   - Use GPU if available
   - Try `sam2-hiera-tiny` model for faster inference

4. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
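For the CPU-mode workaround in item 2, the environment variable can also be set from Python instead of the shell. This is a sketch, and it assumes the assignment runs before PyTorch initializes CUDA, so it is safest at the very top of the entry script, before `import torch`.

```python
import os

# Hide all CUDA devices from this process. This must happen before the
# first CUDA initialization, so keep it above any `import torch` line.
os.environ["CUDA_VISIBLE_DEVICES"] = ""
```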

## 📁 File Structure

```
SAM/
├── streamlit_sam_app.py    # Main application
├── fixed_sam_interface.py  # Original Gradio version
├── requirements.txt        # Dependencies
└── README.md              # This file
```

## 🎨 Interface Screenshots

The app features a clean, modern interface with:
- Full-width image display
- Intuitive sidebar controls
- Real-time point visualization
- Side-by-side result comparison
- Comprehensive download options

## 🤝 Contributing

Feel free to submit issues, feature requests, or pull requests!

## 📄 License

This project uses Meta's SAM 2.0 model. Please refer to Meta's license terms for the model weights.

## 🙏 Acknowledgments

- Meta AI for the incredible SAM 2.0 model
- Streamlit for the amazing web app framework
- Hugging Face for model hosting
- The open-source community for all the dependencies