https://scholars.lib.ntu.edu.tw/handle/123456789/633895
Title: | Combined 2D and 3D Convolution Residual Attention Network for Hand Gesture Recognition | Authors: | Tsai, Chang Ting; JIAN-JIUN DING |
Issue Date: | 1-Jan-2022 | Source: | Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022 | Abstract: | Hand gesture recognition is a classical problem in human-computer interaction research. In this paper, a learning-based model is proposed for hand gesture recognition. Our model takes RGB and depth channels as input. Segmenting the hand region is an important issue in recognizing the hand gesture. First, we apply a patch embedding layer to encode all the frames as several patches. These encoded patches are then fed into a 3D convolution network. The 3D convolution layers simultaneously learn the spatial and temporal features of the video. The 3D convolution network also contains an attention block, which is used to enhance the crucial feature map values. In addition, the encoded patches pass through a local decoder to recover the depth frames of the video. This operation preserves the depth information in the encoded patches. Finally, we apply a linear classifier to the output of the 3D convolution network to obtain the hand gesture result. Our method achieves 80.5% accuracy on the NV-Gesture dataset and 89.6% accuracy on the SKIG dataset. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/633895 | ISBN: | 9786165904773 | DOI: | 10.23919/APSIPAASC55919.2022.9980075 |
Appears in Collections: | Department of Electrical Engineering |
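The pipeline described in the abstract (patch embedding, a 3D convolution block with attention, a local decoder for depth, and a linear classifier) can be followed as a shape-level forward pass. The NumPy code below is only an illustrative sketch under stated assumptions, not the authors' implementation: the layer sizes, the squeeze-and-excitation style gate standing in for the attention block, the per-patch linear map standing in for the local decoder, the class count, and all function and parameter names are hypothetical, and the weights are random and untrained.

```python
import numpy as np

def patch_embed(video, patch, weight):
    """Split every frame into non-overlapping patch x patch tiles and project each tile.
    video: (T, H, W, C); weight: (patch*patch*C, D); returns (T, H//patch, W//patch, D)."""
    T, H, W, C = video.shape
    gh, gw = H // patch, W // patch
    tiles = video[:, :gh * patch, :gw * patch, :].reshape(T, gh, patch, gw, patch, C)
    tiles = tiles.transpose(0, 1, 3, 2, 4, 5).reshape(T, gh, gw, patch * patch * C)
    return tiles @ weight

def conv3d(x, kernel):
    """Naive 'valid' 3D convolution over (time, height, width).
    x: (T, H, W, Cin); kernel: (kt, kh, kw, Cin, Cout)."""
    kt, kh, kw = kernel.shape[:3]
    windows = np.lib.stride_tricks.sliding_window_view(x, (kt, kh, kw), axis=(0, 1, 2))
    # windows has shape (T', H', W', Cin, kt, kh, kw)
    return np.einsum("thwcijk,ijkco->thwo", windows, kernel)

def channel_attention(x, w1, w2):
    """Assumed attention form: a squeeze-and-excitation style gate per channel."""
    squeezed = x.mean(axis=(0, 1, 2))              # (C,) global average
    hidden = np.maximum(squeezed @ w1, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))    # sigmoid, values in (0, 1)
    return x * gate                                # rescale each channel

def residual_attention_block(x, kernel, w1, w2):
    """Zero-pad, apply 3D conv + ReLU, gate the channels, add the skip connection.
    Assumes odd kernel sizes and Cin == Cout so shapes match."""
    kt, kh, kw = kernel.shape[:3]
    padded = np.pad(x, ((kt // 2,) * 2, (kh // 2,) * 2, (kw // 2,) * 2, (0, 0)))
    y = np.maximum(conv3d(padded, kernel), 0.0)
    return x + channel_attention(y, w1, w2)

def forward(rgbd, params):
    tokens = patch_embed(rgbd, params["patch"], params["embed"])
    # Hypothetical local decoder: map each encoded patch back to its depth pixels.
    depth_recon = tokens @ params["decode"]
    feat = residual_attention_block(tokens, params["kernel"], params["w1"], params["w2"])
    logits = feat.mean(axis=(0, 1, 2)) @ params["cls"]   # pooled features -> linear classifier
    return logits, depth_recon

# Placeholder sizes: 8 RGB-D frames of 32x32 pixels, 8x8 patches, 16-dim embedding.
rng = np.random.default_rng(0)
T, H, W, C, P, D, num_classes = 8, 32, 32, 4, 8, 16, 25
params = {
    "patch": P,
    "embed": rng.normal(size=(P * P * C, D)) * 0.05,
    "decode": rng.normal(size=(D, P * P)) * 0.05,
    "kernel": rng.normal(size=(3, 3, 3, D, D)) * 0.05,
    "w1": rng.normal(size=(D, D // 4)),
    "w2": rng.normal(size=(D // 4, D)),
    "cls": rng.normal(size=(D, num_classes)),
}
video = rng.random((T, H, W, C))
logits, depth_recon = forward(video, params)
print("logits:", logits.shape, "depth reconstruction:", depth_recon.shape)
```

In a trained version of such a sketch, the depth reconstruction would typically be compared against the true depth tiles as an auxiliary loss, which is one way to read the abstract's remark that the decoder preserves depth information in the encoded patches.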