GLPose: Global-Local Attention Network with Feature Interpolation Regularization for Head Pose Estimation of People Wearing Facial Masks
Journal
BMVC 2022 - 33rd British Machine Vision Conference Proceedings
Date Issued
2022-01-01
Author(s)
Abstract
Precise head pose estimation from RGB images is essential for many applications, such as monitoring a driver's status for driving safety and understanding passengers' actions. Because of the COVID-19 pandemic, people have been required to wear masks in almost all public places, sometimes even in a vehicle, and head pose estimation becomes more challenging for existing methods when the face is occluded. To tackle this issue, we propose a novel siamese network that integrates global-local attention mechanisms with data augmentation and a multi-task learning strategy. Specifically, we first incorporate data augmentation that synthesizes facial masks on human faces, together with landmark prediction in the training stage, to help the model generalize and remain robust. Next, a global-local attention mechanism is designed so that relationships across the whole feature map can be learned and critical spatial-channel information can be enhanced, yielding a better feature representation. Lastly, the feature interpolation regularization module uses pairs of feature embeddings from the siamese network to optimize the feature embedding. We evaluate the proposed method on the AFLW2000, BIWI, and MAFA datasets. Extensive experiments show that our method achieves highly promising performance on these public datasets.
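The abstract does not give the exact form of the feature interpolation regularization, so the following is only an illustrative sketch of one plausible reading: a mixup-style linear interpolation between the two embeddings produced by the siamese branches, with the pose target interpolated by the same weight. The function names, the weight `lam`, the L1 penalty, and the toy linear regression head are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def interpolate_features(z_a, z_b, lam):
    """Linearly interpolate two embedding batches of identical shape.

    Assumption: z_a and z_b stand in for the paired embeddings from the
    two siamese branches; lam in [0, 1] is a hypothetical mixing weight.
    """
    return lam * z_a + (1.0 - lam) * z_b

def interpolation_regularizer(z_a, z_b, y_a, y_b, head, lam):
    """Mean absolute error between the head's prediction on the mixed
    embedding and the equally mixed pose targets (yaw, pitch, roll).

    Assumption: an L1 penalty on linearly mixed targets; the paper may
    define the regularizer differently.
    """
    z_mix = interpolate_features(z_a, z_b, lam)
    y_mix = lam * y_a + (1.0 - lam) * y_b
    pred = head(z_mix)
    return float(np.mean(np.abs(pred - y_mix)))

# Toy setup: 4 samples, 8-dimensional embeddings, 3 pose angles.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))

def linear_head(z):
    # Hypothetical stand-in for the pose regression head.
    return z @ W

z_a = rng.normal(size=(4, 8))
z_b = rng.normal(size=(4, 8))
y_a = linear_head(z_a)
y_b = linear_head(z_b)

loss = interpolation_regularizer(z_a, z_b, y_a, y_b, linear_head, lam=0.3)
```

In this toy case the head is exactly linear, so the penalty is zero up to floating-point error. With a nonlinear head the term would be positive wherever predictions deviate from linear behavior between the paired embeddings, which is the kind of smoothness such a regularizer is meant to encourage.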
SDGs
Type
conference paper
