scene_classifier / README.md
license: apache-2.0

Classifier architecture

The classifier uses DenseNet161 as the encoder, followed by several linear layers that form the classification head.

Model accuracy:

The model achieves 91.3% accuracy on the validation set.

  • F1-score per class: {'digital': 0.9873773235685747, 'hard': 0.9338602782753218, 'soft': 0.8444277483052108}
  • Mean F1-score: 0.9218884500497024
  • Accuracy: 0.913
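The reported mean F1-score matches the unweighted (macro) average of the three per-class scores. A short check, using only the numbers listed above:

```python
# Per-class F1-scores as reported in this README.
f1_per_class = {
    "digital": 0.9873773235685747,
    "hard": 0.9338602782753218,
    "soft": 0.8444277483052108,
}

# Macro average: every class counts equally, regardless of its size.
mean_f1 = sum(f1_per_class.values()) / len(f1_per_class)

# Class with the lowest F1-score.
weakest_class = min(f1_per_class, key=f1_per_class.get)

print(f"macro F1: {mean_f1:.4f}, weakest class: {weakest_class}")
```

Because the average is unweighted, the lower score of the soft class pulls the mean down as much as any other class would, even though the classes differ a lot in size.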

Training dataset metadata:

  1. Dataset classes: ['soft', 'digital', 'hard']
  2. Number of classes: 3
  3. Total number of images: 18415

Number of images per class:

  • soft : 5482
  • digital : 1206
  • hard : 11727
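The class counts are noticeably imbalanced. The sketch below computes each class's share of the dataset and a set of inverse-frequency class weights. The weighting formula (`total / (num_classes * count)`) is a common heuristic shown here only for illustration; the README does not say whether any class weighting or resampling was used during training.

```python
# Image counts per class as listed in this README.
class_counts = {"soft": 5482, "digital": 1206, "hard": 11727}

total_images = sum(class_counts.values())

# Fraction of the dataset that belongs to each class.
class_share = {name: n / total_images for name, n in class_counts.items()}

# Assumed "balanced" heuristic: rarer classes get proportionally larger weights.
class_weight = {
    name: total_images / (len(class_counts) * n)
    for name, n in class_counts.items()
}

for name in class_counts:
    print(f"{name:8s} share={class_share[name]:.3f} weight={class_weight[name]:.3f}")
```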

Classes description:

  1. The hard class covers scenes that call for a coarser background removal method, suited to objects with well-defined edges and no fine detail. It contains the following object categories: object, laptop, charger, pc mouse, pc, rocks, table, bed, box, sneakers, ship, wire, guitar, fork, spoon, plate, keyboard, car, bus, screwdriver, ball, door, flower, clocks, fruit, food, robot.

  2. The soft class covers scenes that call for a soft background removal method, suited to people, hair, clothes, and similar objects. It contains the following object categories: animal, people, human, man, woman, t-shirt, hairs, hair, dog, cat, monkey, cow, medusa, clothes.

  3. The digital class covers images with digital graphics, such as screenshots and logos. It contains the following scene category: screenshot.
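The category lists above amount to a lookup from object category to scene class. The sketch below writes that taxonomy down as a dictionary; the function name `scene_class_for` and the lowercase/strip normalization are hypothetical helpers for illustration. It reflects how the dataset labels are grouped, not how the trained model makes its predictions.

```python
from typing import Optional

# Categories per class, copied from the class descriptions in this README.
CLASS_CATEGORIES = {
    "hard": [
        "object", "laptop", "charger", "pc mouse", "pc", "rocks", "table",
        "bed", "box", "sneakers", "ship", "wire", "guitar", "fork", "spoon",
        "plate", "keyboard", "car", "bus", "screwdriver", "ball", "door",
        "flower", "clocks", "fruit", "food", "robot",
    ],
    "soft": [
        "animal", "people", "human", "man", "woman", "t-shirt", "hairs",
        "hair", "dog", "cat", "monkey", "cow", "medusa", "clothes",
    ],
    "digital": ["screenshot"],
}

# Invert the mapping: category name -> scene class.
CATEGORY_TO_CLASS = {
    category: scene_class
    for scene_class, categories in CLASS_CATEGORIES.items()
    for category in categories
}


def scene_class_for(category: str) -> Optional[str]:
    """Return 'hard', 'soft', or 'digital' for a known category, else None."""
    return CATEGORY_TO_CLASS.get(category.strip().lower())
```

For example, `scene_class_for("guitar")` returns `"hard"`, while a category that is not in any list returns `None`.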