Abstract:
Recently, deep learning-based crowd counting methods have achieved promising performance on test data drawn from the same distribution as the training set, but performance usually degrades when testing on other or unseen domains. Because scene contexts, crowd densities, and head scales vary across domains, tackling multi-domain crowd counting with a single deep model is very challenging. In this work, we propose a domain-guided channel attention network (DCANet) for learning multi-domain crowd counting. Specifically, DCANet consists of a feature extraction module, a channel attention-guided multi-dilation (CAMD) module, and a density map prediction module. Given a test image from a certain domain, channel attention guides the extraction of a domain-specific feature representation, so DCANet can adaptively handle images from multiple domains. We further propose two domain-guided learning strategies, i.e., dataset-level domain kernel (DDK) supervision and image-level domain kernel (IDK) supervision, by which the channel attention in CAMD can be explicitly optimized to emphasize the channels corresponding to the domain of an input image. Furthermore, the IDK can be adaptively updated while training DCANet, thereby improving generalization to unseen scenes. Experimental results on benchmark datasets show that DCANet performs favorably on multi-domain datasets using a single model. Moreover, our IDK training strategy can be applied to boost state-of-the-art methods on single-domain datasets.
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 33, Issue: 11, November 2023)