Gesture Recognition

Introduction

In this project, we present a skeleton-based pipeline for gesture recognition. The pipeline has three stages. The first stage is a hand detection module that outputs bounding boxes of human hands from video frames. The second stage uses a pose estimation module to generate keypoints for the detected hands. The third stage uses a skeleton-based gesture recognition module to classify hand actions from the resulting hand skeletons. The pipeline is lightweight and can run in real time on CPU devices. In this README, we provide the models and the inference demo for the project. Training data preparation and training scripts are described in TRAINING.md.
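
The data flow between the three stages can be sketched as follows. This is a minimal illustration, not the project's inference demo: `recognize_gesture` and its three callable parameters (`detect_hands`, `estimate_pose`, `classify_skeleton`) are hypothetical names standing in for the actual models, and keeping only the first detected hand per frame is a simplifying assumption.

```python
from typing import Callable, List, Optional, Sequence, Tuple

BBox = Tuple[float, float, float, float]      # (x1, y1, x2, y2) in pixels
Keypoints = List[Tuple[float, float, float]]  # (x, y, score) per keypoint


def recognize_gesture(
    frames: Sequence[object],
    detect_hands: Callable[[object], List[BBox]],        # stage 1 (hypothetical)
    estimate_pose: Callable[[object, BBox], Keypoints],  # stage 2 (hypothetical)
    classify_skeleton: Callable[[List[Keypoints]], str], # stage 3 (hypothetical)
) -> Optional[str]:
    """Run the three stages in order and return a gesture label, or None."""
    skeleton_sequence: List[Keypoints] = []
    for frame in frames:
        boxes = detect_hands(frame)
        if not boxes:
            continue  # no hand found in this frame, so skip it
        # Assumption: track a single hand by keeping the first box only.
        skeleton_sequence.append(estimate_pose(frame, boxes[0]))
    if not skeleton_sequence:
        return None  # nothing to classify
    return classify_skeleton(skeleton_sequence)
```

Passing the stages in as callables keeps the sketch independent of any particular detector, pose estimator, or classifier API.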

Hand detection stage

Hand detection results on the OneHand10K validation set

| Config | Input Size | bbox mAP | bbox mAP 50 | bbox mAP 75 | ckpt | log |
| --- | --- | --- | --- | --- | --- | --- |
| rtmdet_nano | 320x320 | 0.8100 | 0.9870 | 0.9190 | ckpt | log |

Pose estimation stage

Pose estimation results on COCO-WholeBody-Hand validation set

| Config | Input Size | PCK@0.2 | AUC | EPE | ckpt |
| --- | --- | --- | --- | --- | --- |
| rtmpose_m | 256x256 | 0.815 | 0.837 | 4.51 | ckpt |
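
For readers unfamiliar with the keypoint metrics above, the sketch below shows how a PCK score at threshold 0.2 and EPE are commonly computed for one hand. It is an illustration under stated assumptions, not the evaluation code behind the table: the function names are hypothetical, the normalization length `norm_size` is assumed to be a single bounding-box size, and AUC is omitted.

```python
import math


def epe(pred, gt):
    """End-point error: mean Euclidean distance (in pixels) between keypoints."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists)


def pck(pred, gt, norm_size, threshold=0.2):
    """Fraction of keypoints closer than threshold * norm_size to ground truth."""
    limit = threshold * norm_size
    correct = sum(math.dist(p, g) < limit for p, g in zip(pred, gt))
    return correct / len(gt)


# Toy example with four keypoints; the distances are 5, 0, 0 and 10 pixels.
gt = [(0, 0), (10, 0), (0, 10), (10, 10)]
pred = [(3, 4), (10, 0), (0, 10), (10, 20)]

print(epe(pred, gt))                # 3.75
print(pck(pred, gt, norm_size=30))  # 0.75 (limit is 6 pixels)
```

Higher is better for the PCK score, while lower is better for EPE.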

Gesture recognition stage

Skeleton-based gesture recognition results on the Jester validation set

| Config | Input Size | Top 1 accuracy | Top 5 accuracy | ckpt | log |
| --- | --- | --- | --- | --- | --- |
| STGCNPP | 100x17x3 | 89.22 | 97.52 | ckpt | log |
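
As a reminder of what the Top 1 and Top 5 columns measure, the sketch below computes top-k accuracy from per-class scores. It is a generic illustration with a hypothetical helper name and toy data, not the evaluation code used to produce the numbers above.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=row.__getitem__, reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Toy example: four samples, three classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 0, 0]

print(top_k_accuracy(scores, labels, k=1))  # 50.0
print(top_k_accuracy(scores, labels, k=2))  # 75.0
```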