Gesture Recognition
Introduction
This project presents a skeleton-based pipeline for gesture recognition. The pipeline has three stages. First, a hand detection module outputs bounding boxes of human hands from video frames. Second, a pose estimation module generates keypoints for the detected hands. Third, a skeleton-based gesture recognition module classifies hand actions from the resulting hand skeletons. The pipeline is lightweight and can run in real time on CPU devices. This README provides the models and the inference demo for the project. Training data preparation and training scripts are described in TRAINING.md.
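The control flow of the three stages can be sketched roughly as follows. This is an illustrative outline only, not the project's demo code: `GesturePipeline`, the three stage callables, and the `window_size` parameter are hypothetical names, and the real models would be plugged in where the placeholders are. The sketch also assumes that only one hand is followed and that the classifier consumes a fixed-length window of skeletons.

```python
from collections import deque
from typing import Any, Callable, Deque, List, Optional, Sequence, Tuple

BBox = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
Keypoints = List[Tuple[float, float]]      # one (x, y) per hand joint


class GesturePipeline:
    """Chains detection, pose estimation, and gesture classification.

    All three callables are hypothetical placeholders for the real models.
    """

    def __init__(
        self,
        detect_hands: Callable[[Any], Sequence[BBox]],
        estimate_keypoints: Callable[[Any, BBox], Keypoints],
        classify_gesture: Callable[[List[Keypoints]], str],
        window_size: int = 16,  # assumed window length, not taken from the project
    ) -> None:
        self.detect_hands = detect_hands
        self.estimate_keypoints = estimate_keypoints
        self.classify_gesture = classify_gesture
        # Sliding window holding the most recent hand skeletons.
        self.skeletons: Deque[Keypoints] = deque(maxlen=window_size)

    def step(self, frame: Any) -> Optional[str]:
        """Process one frame; return a label once the window is full."""
        # Stage 1: hand bounding boxes for this frame.
        boxes = self.detect_hands(frame)
        if not boxes:
            return None  # no hand found, so this frame is skipped

        # Stage 2: keypoints for the first detected hand (simplification).
        self.skeletons.append(self.estimate_keypoints(frame, boxes[0]))

        # Stage 3: classify only when enough skeletons have accumulated.
        if len(self.skeletons) < (self.skeletons.maxlen or 0):
            return None
        return self.classify_gesture(list(self.skeletons))
```

Calling `step` once per video frame keeps the per-frame cost at one detector pass and one pose pass, with the classifier running only after the window has filled.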
Hand detection stage
Hand detection results on the OneHand10K validation set
Config | Input Size | bbox mAP | bbox mAP 50 | bbox mAP 75 | ckpt | log |
---|---|---|---|---|---|---|
rtmdet_nano | 320x320 | 0.8100 | 0.9870 | 0.9190 | ckpt | log |
Pose estimation stage
Pose estimation results on the COCO-WholeBody-Hand validation set
Config | Input Size | PCK@0.2 | AUC | EPE | ckpt |
---|---|---|---|---|---|
rtmpose_m | 256x256 | 0.815 | 0.837 | 4.51 | ckpt |
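For readers unfamiliar with the keypoint metrics in this table, the sketch below shows how PCK (percentage of correct keypoints) and EPE (end-point error) are commonly computed for a single hand. It is a simplified illustration, not the evaluation code used to produce the numbers above. The function names are hypothetical, and normalizing by a single `norm_size` value (for example the longer side of the hand bounding box) is an assumption; evaluation toolkits differ in the exact normalization.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def epe(pred: Sequence[Point], gt: Sequence[Point]) -> float:
    """Mean Euclidean distance, in pixels, between predicted and true keypoints."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists)


def pck(
    pred: Sequence[Point],
    gt: Sequence[Point],
    norm_size: float,        # assumed normalizer, e.g. longer bbox side in pixels
    threshold: float = 0.2,
) -> float:
    """Fraction of keypoints whose error is below threshold * norm_size."""
    limit = threshold * norm_size
    correct = sum(math.dist(p, g) < limit for p, g in zip(pred, gt))
    return correct / len(gt)
```

Higher PCK is better, while lower EPE is better, which is why the two columns move in opposite directions when a model improves.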
Gesture recognition stage
Skeleton-based gesture recognition results on the Jester validation set