---
title: SAM-Grounding-DINO
emoji: 🎭
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
---
# 🎭 SAM 2.1 + Grounding DINO Interactive Segmentation

A web application that combines Meta's SAM 2.1 with IDEA Research's Grounding DINO for text-based and point-based image segmentation, so you can create and download the mask you want.

## ✨ Features

- **🔍 Text-Based Segmentation**: Type what you want to segment (e.g., "snoopy", "person", "car")
- **📍 Point-Based Segmentation**: Click on objects for precise manual control
- **🎭 Multiple Mask Generation**: Generate 1-5 masks and browse through them
- **🤖 SAM 2.1 + Grounding DINO**: Powered by Meta's SAM 2.1 and IDEA Research's Grounding DINO
- **📱 Smart Auto-Detection**: Automatically chooses between text and point modes
- **💾 Multiple Export Formats**: Download masks as PNG, JPG, or PyTorch tensors
- **🖼️ High-Resolution Display**: View images and masks in full detail
- **⚡ Real-Time Processing**: Fast inference with GPU acceleration

## 🚀 Quick Start

### Installation

1. Clone or download the repository
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

### Running the App

```bash
streamlit run streamlit_sam_app.py
```

The app opens in your browser at `http://localhost:8501`.

## 🎯 How to Use

### 1. Upload an Image
- Click "📷 Upload an image" to select an image file
- Supported formats: JPG, JPEG, PNG, BMP

### 2. Add Points
Choose between **Positive** (include) and **Negative** (exclude) point mode:

#### Quick Presets:
- **🎯 Center**: Add point at image center
- **↖️ Top-Left**: Add point at top-left quarter
- **↗️ Top-Right**: Add point at top-right quarter
- **🎲 Random**: Add random point anywhere

#### Manual Input:
- Enter X,Y coordinates manually
- Points are validated against image boundaries
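The presets and the boundary check can be sketched in a few lines. This is a minimal illustration, not the app's actual code: the function names are hypothetical, and it assumes "top-left quarter" and "top-right quarter" mean the centers of those quadrants.

```python
import random
from typing import Optional, Tuple

Point = Tuple[int, int]


def preset_point(name: str, width: int, height: int,
                 rng: Optional[random.Random] = None) -> Point:
    """Return (x, y) for a named preset (names are illustrative)."""
    if name == "center":
        return (width // 2, height // 2)
    if name == "top_left":
        # Assumption: center of the top-left quadrant.
        return (width // 4, height // 4)
    if name == "top_right":
        # Assumption: center of the top-right quadrant.
        return (3 * width // 4, height // 4)
    if name == "random":
        rng = rng or random.Random()
        return (rng.randrange(width), rng.randrange(height))
    raise ValueError(f"unknown preset: {name}")


def is_valid_point(x: int, y: int, width: int, height: int) -> bool:
    """A point is valid when it lands on a pixel inside the image."""
    return 0 <= x < width and 0 <= y < height
```

Pixel coordinates are zero-based here, so for a 640x480 image the largest valid point is `(639, 479)`.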

### 3. Generate Segmentation Mask
- Click "🎯 Generate Segmentation Mask"
- Adjust the mask threshold in the sidebar (0.0-1.0)
- Wait for SAM 2.0 to process (may take 10-30 seconds)
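To see why a lower threshold yields a larger mask, consider the sketch below. It assumes the model produces a per-pixel probability map in the range 0.0 to 1.0; the `binarize` helper is hypothetical and only illustrates the thresholding step.

```python
import numpy as np


def binarize(prob_map: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Keep every pixel whose probability exceeds the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return prob_map > threshold


# Toy 2x2 probability map standing in for real model output.
probs = np.array([[0.1, 0.4],
                  [0.6, 0.9]])

loose = binarize(probs, 0.3)   # lower threshold keeps more pixels
strict = binarize(probs, 0.7)  # higher threshold keeps fewer pixels
```

In this toy example the loose mask keeps three pixels and the strict mask keeps one.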

### 4. View Results
- **Original Image with Points**: Shows your input selections
- **Generated Segmentation Mask**: Red overlay on original image
- **Binary Mask Preview**: Black/white mask for download
- **Statistics**: Pixel counts and coverage percentage
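The statistics and the red overlay can be derived directly from the binary mask. The following is a sketch under stated assumptions: the mask is a 2D array, the image is an RGB `uint8` array of the same height and width, and the function names, dictionary keys, and 50% blend factor are illustrative choices rather than the app's own.

```python
import numpy as np


def mask_stats(mask: np.ndarray) -> dict:
    """Count masked pixels and express them as a share of the image."""
    mask = mask.astype(bool)
    total = int(mask.size)
    selected = int(np.count_nonzero(mask))
    coverage = round(100.0 * selected / total, 2) if total else 0.0
    return {
        "mask_pixels": selected,
        "total_pixels": total,
        "coverage_percent": coverage,
    }


def red_overlay(image: np.ndarray, mask: np.ndarray,
                alpha: float = 0.5) -> np.ndarray:
    """Blend pure red into the masked pixels of an RGB uint8 image."""
    mask = mask.astype(bool)
    out = image.astype(np.float32)
    red = np.array([255.0, 0.0, 0.0], dtype=np.float32)
    out[mask] = (1.0 - alpha) * out[mask] + alpha * red
    return out.round().astype(np.uint8)
```

Pixels outside the mask are left unchanged, so the overlay can be compared side by side with the original image.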

### 5. Download Results
- **📥 Download Mask (PNG)**: Binary mask file
- **📥 Download Overlay (PNG)**: Mask overlaid on original
- **📥 Download Data (JSON)**: Complete metadata and statistics
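The exact schema of the JSON download is not documented here. As a rough sketch of what such an export could contain, the hypothetical `build_export` helper below bundles the points, threshold, and statistics; every field name is an assumption made for illustration.

```python
import json
from datetime import datetime, timezone


def build_export(points, threshold, stats) -> str:
    """Serialize a segmentation run to JSON (hypothetical schema).

    `points` is a list of (x, y, is_positive) tuples.
    """
    payload = {
        "created_at": datetime.now(timezone.utc).isoformat(),
        "threshold": threshold,
        "points": [
            {"x": x, "y": y, "label": "positive" if positive else "negative"}
            for x, y, positive in points
        ],
        "statistics": stats,
    }
    return json.dumps(payload, indent=2)


exported = build_export(
    points=[(320, 240, True), (10, 10, False)],
    threshold=0.5,
    stats={"mask_pixels": 1200, "total_pixels": 307200},
)
```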

## 🎛️ Advanced Controls

### Sidebar Options:
- **Point Mode**: Switch between Positive/Negative points
- **Mask Threshold**: Control mask sensitivity (lower = larger masks)
- **Clear Points**: Remove all points at once

### Point Management:
- View all current points with coordinates
- Delete individual points with 🗑️ buttons
- Real-time count of positive/negative points

## 🔧 Technical Details

### SAM 2.0 Model
- Uses `facebook/sam2-hiera-small` by default
- Automatically downloads model weights on first run
- Runs on GPU if available, CPU otherwise
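The GPU/CPU fallback usually comes down to a single check. A minimal sketch, assuming PyTorch's `torch.cuda.is_available()`; the `pick_device` name is hypothetical, and the `ImportError` branch exists only so the snippet runs even where PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch only: without PyTorch, report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running inference on: {device}")
```

A model loaded with PyTorch can then be moved to the chosen device with `model.to(device)`.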

### Dependencies
- `streamlit`: Web interface
- `torch`: PyTorch for model inference
- `transformers`: Hugging Face model loading
- `Pillow` (imported as `PIL`): Image loading and manipulation
- `matplotlib`: Visualization
- `numpy`: Numerical operations
- `opencv-python`: Image processing utilities

### System Requirements
- Python 3.8+
- 4GB+ RAM recommended
- GPU recommended for faster processing

## 🐛 Troubleshooting

### Common Issues:

1. **Model Download Fails**:
   - Check internet connection
   - Make sure you can reach the Hugging Face Hub (some models require an access token)

2. **CUDA Out of Memory**:
   - Try a smaller model size
   - Reduce image resolution
   - Force CPU mode by setting the environment variable `CUDA_VISIBLE_DEVICES=""` before launching the app

3. **Slow Processing**:
   - Use GPU if available
   - Try `sam2-hiera-tiny` model for faster inference

4. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
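To narrow down an import error, it can help to check which packages Python can actually find. The sketch below uses the standard library's `importlib.util.find_spec`; the package list mirrors the Dependencies section of this README (not necessarily the contents of `requirements.txt`), and `missing_packages` is a hypothetical helper name. Note that two packages are imported under a different name than they are installed with: `Pillow` as `PIL` and `opencv-python` as `cv2`.

```python
import importlib.util

# pip package name -> module name used in `import` statements
REQUIRED = {
    "streamlit": "streamlit",
    "torch": "torch",
    "transformers": "transformers",
    "Pillow": "PIL",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "opencv-python": "cv2",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```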

## 📁 File Structure

```
SAM/
├── streamlit_sam_app.py    # Main application
├── fixed_sam_interface.py  # Original Gradio version
├── requirements.txt        # Dependencies
└── README.md               # This file
```

## 🎨 Interface Screenshots

The app features a clean, modern interface with:
- Full-width image display
- Intuitive sidebar controls
- Real-time point visualization
- Side-by-side result comparison
- Comprehensive download options

## 🤝 Contributing

Feel free to submit issues, feature requests, or pull requests!

## 📄 License

This project uses Meta's SAM 2.0 model. Please refer to Meta's license terms for the model weights.

## 🙏 Acknowledgments

- Meta AI for the incredible SAM 2.0 model
- Streamlit for the amazing web app framework
- Hugging Face for model hosting
- The open-source community for all the dependencies