DECCNet: Depth enhanced crowd counting
Journal
Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019
Pages
4521-4530
Date Issued
2019
Author(s)
Abstract
Crowd counting, which aims to calculate the total number of instances in an image, is a classic but crucial task that supports many applications. Most prior works rely on the RGB channels of the image and achieve satisfactory performance. However, previous approaches struggle to count highly congested regions because the shapes there are incomplete and blurry. In this paper, we present an effective crowd counting method, Depth Enhanced Crowd Counting Network (DECCNet), which leverages estimated depth information through our novel Bidirectional Cross-modal Attention (BCA) mechanism. The depth information enables our model to explicitly learn to attend to congested regions. Our BCA mechanism interactively fuses the two input modalities by learning to focus on the informative parts of each according to the other. In our experiments, we demonstrate that DECCNet outperforms the state of the art on the two largest crowd counting datasets available, including UCF-QNRF, which has the highest crowd density. Visualized results show that our method can accurately regress dense regions by leveraging depth information. Ablation studies also indicate that each component of our method benefits the final prediction. © 2019 IEEE.
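The abstract describes the two modalities attending to each other but gives no architectural detail. The following is a minimal NumPy sketch of one possible reading, in which each modality produces a spatial gate that reweights the other before the two are concatenated. The function names, the single-channel sigmoid gate, the weight vectors, and the concatenation step are all assumptions made for illustration; this is not the BCA module as published.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_gate(features, weights):
    """Project (C, H, W) features to a single-channel (1, H, W) gate in (0, 1).

    Hypothetical helper: `weights` has shape (C,) and acts like a 1x1
    convolution with one output channel, followed by a sigmoid.
    """
    logits = np.tensordot(weights, features, axes=([0], [0]))  # shape (H, W)
    return sigmoid(logits)[np.newaxis, :, :]

def bidirectional_cross_modal_fusion(rgb_feat, depth_feat, w_rgb, w_depth):
    """Assumed fusion scheme: each modality is reweighted by a gate
    computed from the other, then the two are stacked channel-wise."""
    gate_from_depth = spatial_gate(depth_feat, w_depth)  # depth tells RGB where to look
    gate_from_rgb = spatial_gate(rgb_feat, w_rgb)        # RGB tells depth where to look
    rgb_attended = rgb_feat * gate_from_depth
    depth_attended = depth_feat * gate_from_rgb
    return np.concatenate([rgb_attended, depth_attended], axis=0)

# Toy feature maps standing in for RGB and estimated-depth encoder outputs.
rng = np.random.default_rng(0)
rgb_feat = rng.standard_normal((4, 8, 8))
depth_feat = rng.standard_normal((4, 8, 8))
w_rgb = rng.standard_normal(4)
w_depth = rng.standard_normal(4)

fused = bidirectional_cross_modal_fusion(rgb_feat, depth_feat, w_rgb, w_depth)
print(fused.shape)  # (8, 8, 8)
```

In a trained network the gate weights would be learned jointly with the counting objective; here they are random and serve only to show the data flow in both directions.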
Subjects
Cross modal fusion; Crowd counting; RGBD
Other Subjects
Computer science; Computers; Counting networks; Cross-modal; Crowd counting; Crowd density; Depth information; Input modalities; RGBD; State of the art; Computer vision
Type
conference paper
