deki-yolo: Mobile UI Element Detection Model

This is a YOLO model trained to identify common UI elements in mobile screenshots. It is the core detection model for the deki Hugging Face Space and the deki GitHub repository.

Model Description

The model is trained to detect the following four classes of UI elements:

  • View: General-purpose containers.
  • ImageView: Icons and images.
  • Text: Text elements.
  • Line: Separators and lines.

This model can be used as a foundational component for applications that need to understand screen layouts, such as AI agents for mobile automation, accessibility tools, and UI code generation.
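Since this is a standard YOLO checkpoint, it can in principle be loaded with the Ultralytics Python API. Below is a minimal sketch, not an official usage example; the weight file name `deki-yolo.pt` and the screenshot path are placeholders and may differ from the actual file in this repository.

```python
from ultralytics import YOLO

# Load the detection weights (placeholder file name; use the actual
# .pt file shipped with this repository).
model = YOLO("deki-yolo.pt")

# Run inference on a mobile screenshot.
results = model("screenshot.png")

# Print one line per detected UI element: class name, confidence, box.
for box in results[0].boxes:
    cls_name = model.names[int(box.cls)]  # View, ImageView, Text, or Line
    conf = float(box.conf)
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    print(f"{cls_name:<10} conf={conf:.2f} box=({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```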


YOLO examples

Bounding boxes with classes for bb_1:

example1

Bounding boxes without classes but with IDs after NMS for bb_1:

example2

Bounding boxes with classes for bb_2:

example3

Bounding boxes without classes but with IDs after NMS for bb_2:

example4
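The second image in each pair shows boxes that survive non-maximum suppression and are then numbered sequentially. The card does not publish that post-processing code, but a rough class-agnostic sketch, assuming an Ultralytics result object and torchvision's `nms` helper, could look like this:

```python
import torch
from torchvision.ops import nms


def boxes_with_ids(result, iou_threshold: float = 0.5):
    """Apply class-agnostic NMS to a YOLO result and return (id, box) pairs."""
    xyxy = result.boxes.xyxy    # (N, 4) tensor of corner coordinates
    scores = result.boxes.conf  # (N,) tensor of confidences
    keep = nms(xyxy, scores, iou_threshold)
    # Number the surviving boxes sequentially; these are the IDs drawn
    # in the "without classes but with IDs" examples above.
    return [(i, xyxy[idx].tolist()) for i, idx in enumerate(keep)]
```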

YOLO model accuracy

The model was trained on 486 images and tested on 60 images.

Current YOLO model accuracy: example5
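If "accuracy" here refers to the standard detection metrics reported by Ultralytics, they could be reproduced with the validation API along the lines of the sketch below; `deki.yaml` is a hypothetical dataset config pointing at the 60-image test split and the four classes, and the weight file name is again a placeholder.

```python
from ultralytics import YOLO

model = YOLO("deki-yolo.pt")  # placeholder weight file name

# Validate against the held-out split described by the dataset config.
metrics = model.val(data="deki.yaml", split="test")
print(metrics.box.map50, metrics.box.map)  # mAP@0.5 and mAP@0.5:0.95
```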
