Model Details
This model is a port of the ViTMatte models, which are trained and evaluated on the Composition-1k and Distinctions-646 datasets. The port focuses on preserving the performance and accuracy of the original models.
Note: This port is provided for convenience of use and to help promote and build on this excellent open-source project.
Usage
This model performs image matting: given an input image and a trimap, it predicts an alpha matte that separates the foreground from the background.
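As a minimal usage sketch, assuming the port is loadable through the ViTMatte classes in Hugging Face transformers (the checkpoint identifier below is a placeholder; substitute the repository id of this port):

```python
# Minimal usage sketch. Assumes the port works with the ViTMatte classes from
# Hugging Face transformers; the checkpoint name below is a placeholder.
import torch
from PIL import Image
from transformers import VitMatteImageProcessor, VitMatteForImageMatting

checkpoint = "hustvl/vitmatte-small-composition-1k"  # placeholder; use this port's repo id
processor = VitMatteImageProcessor.from_pretrained(checkpoint)
model = VitMatteForImageMatting.from_pretrained(checkpoint)

image = Image.open("image.png").convert("RGB")
trimap = Image.open("trimap.png").convert("L")  # 0 = background, 128 = unknown, 255 = foreground

inputs = processor(images=image, trimaps=trimap, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

alpha = outputs.alphas  # predicted alpha matte, shape (batch, 1, H, W)
```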
Training Data
The model is trained and validated on two datasets:
- Composition-1k: 1,000 samples, used for training and testing.
- Distinctions-646: 646 samples, used for validation.
Training Procedure
The model is trained with gradient descent, and its performance is evaluated with the following four metrics (all are error measures, so lower is better); a brief sketch of how the first two are typically computed follows the list:
- SAD (Sum of Absolute Differences)
- MSE (Mean Squared Error)
- Grad (Gradient)
- Conn (Connectivity)
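As an illustrative sketch only, and not the original project's evaluation code, SAD and MSE are commonly computed over the unknown region of the trimap; the function name, array conventions, and the reporting scales are assumptions:

```python
# Illustrative sketch of SAD and MSE as commonly reported for matting benchmarks.
# Not the original project's evaluation code; names and scaling are assumptions.
import numpy as np

def matting_sad_mse(pred_alpha: np.ndarray, gt_alpha: np.ndarray, trimap: np.ndarray):
    """pred_alpha and gt_alpha are in [0, 1]; trimap uses 128 for the unknown region."""
    unknown = (trimap == 128)
    diff = pred_alpha - gt_alpha
    # SAD over the unknown region, conventionally reported divided by 1000.
    sad = np.abs(diff)[unknown].sum() / 1000.0
    # MSE over the unknown region, conventionally reported scaled by 1e3.
    mse = (diff[unknown] ** 2).mean() * 1000.0
    return sad, mse
```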
Performance
The models achieve the following results on the two datasets:
On the Composition-1k dataset:
| Model      | SAD   | MSE | Grad | Conn  |
|------------|-------|-----|------|-------|
| ViTMatte-S | 21.46 | 3.3 | 7.24 | 16.21 |
| ViTMatte-B | 20.33 | 3.0 | 6.74 | 14.78 |
On the Distinctions-646 dataset:
| Model      | SAD   | MSE | Grad | Conn  |
|------------|-------|-----|------|-------|
| ViTMatte-S | 21.22 | 2.1 | 8.78 | 17.55 |
| ViTMatte-B | 17.05 | 1.5 | 7.03 | 12.95 |
Both models perform well on these datasets, with ViTMatte-B outperforming ViTMatte-S on all four evaluation metrics.
Disclaimer
This model is a port of lufficc's ViTMatte project; all rights to the original work belong to lufficc.
Citation
If you use these models, please cite the original authors and project: https://github.com/hustvl/ViTMatte
Thank you for using these models. If you encounter any issues or have feedback, please raise them on the original project's GitHub page.