Transferring priors from virtual data for crowd counting in real world

Jiang, Xiaoheng; Liu, Hao; Zhang, Li; Li, Geyang; Xu, Mingliang; Lv, Pei; Zhou, Bing

doi:10.1007/s11704-021-0387-8

Research Article
Published: 11 November 2021

Frontiers of Computer Science, Volume 16, article number 163314 (2022)

Abstract

In recent years, crowd counting has drawn increasing attention due to its widespread applications in computer vision. Most existing methods train networks on datasets with few labeled images and are therefore prone to over-fitting. Moreover, existing datasets usually provide only manually labeled head-center positions, an annotation that carries limited information. In this paper, we propose to exploit virtual synthetic crowd scenes to improve the performance of a counting network in the real world. Since people masks are easy to obtain in a synthetic dataset, we first train a segmentation network on the synthetic data to distinguish people from the background. We then transfer the learned segmentation priors from the synthetic data to real-world data. Finally, we train a density estimation network on real-world data using the obtained people masks. Experiments on two crowd counting datasets demonstrate the effectiveness of the proposed method.
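
To make the relation between head-center annotations, density maps, and people masks concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the common practice of building a ground-truth density map by placing one unit-mass Gaussian at each annotated head center, and it assumes, purely for illustration, that a binary people mask is used to zero out background pixels. The function names, kernel size, and sigma are hypothetical choices; the abstract does not specify how the masks enter the density estimation network.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalized to sum to 1 (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def density_from_points(points, shape, size=15, sigma=4.0):
    """Build a density map with one unit-mass Gaussian per (row, col) head center.

    Kernel size and sigma are illustrative assumptions, not values from the paper.
    """
    h, w = shape
    r = size // 2
    # Pad so kernels near the border can be pasted without index checks;
    # mass that falls into the padding is cropped away afterwards.
    padded = np.zeros((h + 2 * r, w + 2 * r), dtype=np.float64)
    kernel = gaussian_kernel(size, sigma)
    for row, col in points:
        padded[row:row + size, col:col + size] += kernel
    return padded[r:r + h, r:r + w]


def apply_people_mask(density: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Suppress density outside the people mask (an assumed use of the mask)."""
    return density * mask.astype(density.dtype)


if __name__ == "__main__":
    # Two hypothetical head annotations in a 64 x 64 image.
    points = [(20, 20), (40, 50)]
    density = density_from_points(points, shape=(64, 64))

    # Hypothetical mask that marks only the top half of the image as "people".
    mask = np.zeros((64, 64), dtype=bool)
    mask[:32, :] = True

    masked = apply_people_mask(density, mask)
    print("count from full map:  ", density.sum())
    print("count from masked map:", masked.sum())
```

In this toy setup both heads lie far enough from the image border that the full map integrates to about 2, while the masked map keeps only the head in the top half and integrates to about 1. This illustrates how a segmentation prior can remove background responses from a density estimate.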

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61802351, 61822701, 61872324, 61772474, 62036010), in part by the China Postdoctoral Science Foundation (2018M632802), and in part by the Key R&D and Promotion Projects in Henan Province (192102310258).

Author information

Authors and Affiliations

School of Information Engineering, Zhengzhou University, Zhengzhou, 450001, China
Xiaoheng Jiang, Hao Liu, Li Zhang, Mingliang Xu, Pei Lv & Bing Zhou
School of Information Engineering, Henan University of Science and Technology, Luoyang, 471000, China
Geyang Li

Corresponding author

Correspondence to Mingliang Xu.

Additional information

Xiaoheng Jiang received the BS, MS, and PhD degrees in electronic information engineering from Tianjin University, China, in 2010, 2013, and 2017, respectively. He is currently an associate professor with the School of Information Engineering, Zhengzhou University, China. His research interests include computer vision and deep learning.

Hao Liu received the BS degree from Zhengzhou University, China, in 2018. He is currently a master's student with the School of Information Engineering, Zhengzhou University, China. His research interests include computer vision and deep learning.

Li Zhang received the BS degree from the University of Electronic Science and Technology of China, China, in 2017, and the MS degree from Zhengzhou University, China, in 2020. He is currently a PhD candidate at Beihang University, China. His research interests include computer vision and deep learning.

Geyang Li is currently a senior student at Henan University of Science and Technology, China. His research interests include computer vision and deep learning.

Mingliang Xu received the PhD degree in computer science and technology from the State Key Laboratory of CAD&CG, Zhejiang University, China. He is currently a Professor with the School of Information Engineering, Zhengzhou University, China, and also the Director of the Center for Interdisciplinary Information Science Research, and the General Secretary of the ACM SIGAI China. His research interests include virtual reality and artificial intelligence.

Pei Lv is an associate professor in the School of Information Engineering, Zhengzhou University, China. His research interests include video analysis and crowd simulation. He received his PhD in 2013 from the State Key Lab of CAD&CG, Zhejiang University, China. He has authored more than 20 journal and conference papers in these areas, including in IEEE TIP, IEEE TCSVT, and ACM MM.

Bing Zhou received the BS and MS degrees from Xi’an Jiao Tong University, China in 1986 and 1989, respectively, and the PhD degree from Beihang University, China in 2003, all in computer science. He is currently a Professor with the School of Information Engineering, Zhengzhou University, China. His research interests cover video processing and understanding, surveillance, computer vision, and multimedia applications.
