An enhanced self-attention and A2J approach for 3D hand pose estimation
Published in Multimedia Tools and Applications, 2021
Recommended citation: M-Y Ng, C.-B. Chng, W-K Koh, CK Chui, MCH Chua. "An enhanced self-attention and A2J approach for 3D hand pose estimation." Multimedia Tools and Applications, 2021.
Three-dimensional (3D) hand pose estimation is the task of estimating the 3D locations of hand keypoints. In recent years, this task has received considerable research attention due to its diverse applications in human-computer interaction and virtual reality. To the best of our knowledge, few studies have modeled self-attention in 3D hand pose estimation, despite its use in various computer vision tasks. Hence, we propose augmenting convolution with self-attention to capture long-range dependencies in a depth image. In addition, motivated by recent work that uses anchor points set on a depth image, we extend anchor points to the depth dimension to regress 3D hand joint locations. Validation experiments using the proposed approaches are performed on various hand pose datasets, and we obtain performance comparable to that of other state-of-the-art methods. The results demonstrate the potential of these approaches in a hand-based recognition system.