## Model Details

This model is a port of the ViTMatte image matting models, which are trained and tested on the Composition-1k and Distinctions-646 datasets. The port focuses on preserving the performance and accuracy of the original models.

Note: this model is ported for convenience of use and to help promote and learn from the excellent original open-source project.
## Usage

This model performs image matting: given an input image and a corresponding trimap, it predicts an alpha matte for the foreground, which can be used for tasks such as background removal, foreground extraction, and image compositing.
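
As a usage illustration, here is a minimal inference sketch, assuming this port can be loaded with the Hugging Face Transformers ViTMatte classes; the checkpoint id and file paths below are placeholders rather than names defined by this repository.

```python
from PIL import Image
import torch
from transformers import VitMatteImageProcessor, VitMatteForImageMatting

# Placeholder checkpoint id; substitute the id of this ported model if it differs.
checkpoint = "hustvl/vitmatte-small-composition-1k"

processor = VitMatteImageProcessor.from_pretrained(checkpoint)
model = VitMatteForImageMatting.from_pretrained(checkpoint)

# ViTMatte expects an RGB image plus a single-channel trimap
# (black = background, white = foreground, gray = unknown region).
image = Image.open("image.png").convert("RGB")
trimap = Image.open("trimap.png").convert("L")

inputs = processor(images=image, trimaps=trimap, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

alpha = outputs.alphas  # predicted alpha matte, shape (batch, 1, H, W)
```

The predicted alpha matte can then be thresholded into a hard mask or used directly to composite the foreground onto a new background.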
## Training Data

The model is trained and validated using two datasets:

- Composition-1k: used for training and testing; it includes 1,000 samples.
- Distinctions-646: includes 646 samples and is used for model validation.
## Training Procedure

The model is trained with gradient descent, and its performance is evaluated using the following four standard matting metrics (a sketch of the first two follows the list):

- SAD (Sum of Absolute Differences)
- MSE (Mean Squared Error)
- Grad (Gradient error)
- Conn (Connectivity error)
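
As a rough illustration of the first two metrics, the sketch below computes SAD and MSE between a predicted and a ground-truth alpha matte. The array names are illustrative, and the exact evaluation protocol (for example, restricting the computation to the trimap's unknown region and the scaling used in the tables below) follows the original repository rather than this sketch; Grad and Conn require the more involved reference implementations and are omitted.

```python
import numpy as np

def sad(pred: np.ndarray, gt: np.ndarray) -> float:
    """Sum of Absolute Differences between alpha mattes in [0, 1].

    Matting papers commonly report this sum divided by 1000.
    """
    return float(np.abs(pred - gt).sum() / 1000.0)

def mse(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Squared Error between alpha mattes in [0, 1]."""
    return float(np.mean((pred - gt) ** 2))

# Illustrative inputs: two random "alpha mattes" of the same size.
pred_alpha = np.random.rand(512, 512).astype(np.float32)
gt_alpha = np.random.rand(512, 512).astype(np.float32)
print(f"SAD: {sad(pred_alpha, gt_alpha):.2f}, MSE: {mse(pred_alpha, gt_alpha):.4f}")
```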
## Performance

The models show the following performance on the two datasets (lower is better for all four metrics):

On the Composition-1k dataset:

| Model | SAD | MSE | Grad | Conn |
|------|----|----|-----|-----|
| ViTMatte-S | 21.46 | 3.3 | 7.24 | 16.21 |
| ViTMatte-B | 20.33 | 3.0 | 6.74 | 14.78 |

On the Distinctions-646 dataset:

| Model | SAD | MSE | Grad | Conn |
|------|----|----|-----|-----|
| ViTMatte-S | 21.22 | 2.1 | 8.78 | 17.55 |
| ViTMatte-B | 17.05 | 1.5 | 7.03 | 12.95 |

Both models perform well on these benchmarks, with ViTMatte-B outperforming ViTMatte-S on every reported metric.
## Disclaimer

This model is ported from the [ViTMatte](https://github.com/hustvl/ViTMatte) project by [hustvl](https://github.com/hustvl). All original rights belong to the original authors.
## Citation

If you use these models, please cite the original authors and project: https://github.com/hustvl/ViTMatte

Thank you for using these models. If you encounter any issues or have feedback, please raise them on the original project's GitHub page.