Dynamic Kernel CNN-LR model for people counting

Tomar, Ankit; Kumar, Santosh; Pant, Bhaskar; Tiwari, Umesh Kumar

doi:10.1007/s10489-021-02375-6

Published: 22 April 2021

Applied Intelligence, Volume 52, pages 55–70 (2022)

Ankit Tomar¹, Santosh Kumar¹ (ORCID: orcid.org/0000-0002-1008-0804), Bhaskar Pant¹ & Umesh Kumar Tiwari¹

Abstract

Counting people in images is a worthwhile task, as it is widely used for public safety, emergency planning, intelligent crowd-flow management, and many other purposes. Counting people manually is impractical: it is very time-consuming and rarely gives accurate results for densely crowded images. As crowd density increases, people appear to partially occlude one another, and this occlusion limits the counting ability of traditional computer vision models. To overcome this problem, we propose a dynamic kernel convolutional neural network-linear regression (DKCNN-LR) model for accurately counting the number of people in image frames, even when the crowd is very dense and occluded. The proposed model works in two phases. First, a DKCNN model arranges its convolution layers so that the kernel weight of each successive layer is half that of the previous convolution layer. The first three heavy-kernel-weight layers identify far-camera-region (low-level) features, and the later light-kernel-weight layers help identify near-camera-region (high-level) features. Second, a linear regression model performs parametric regression between the actual people count (ground truth) and the estimated count (predicted values). The performance of the proposed model was tested on three challenging benchmark datasets of differing quality in terms of MAE, RMSE, Pearson-R and R². The DKCNN-LR model achieved an MAE and RMSE of 1.65 and 2.76 on the Mall dataset, 1.43 and 1.87 on Beijing-BRT, and 2.69 and 10.69 on SmartCity. These results indicate that the proposed model is reliable, effective and robust in real situations.
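As an illustration of the second phase and of the four evaluation measures named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: it assumes the regression is an ordinary least-squares line that maps the CNN's raw estimate to the ground-truth count, and the function names (`fit_line`, `mae`, `rmse`, `pearson_r`, `r_squared`) and the sample counts are hypothetical, chosen only for this example.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y ≈ a*x + b for 1-D data (assumed form of the LR phase)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted counts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted counts."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(x, y):
    """Pearson correlation coefficient between two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sx * sy)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up counts for illustration only; these are not data from the paper.
cnn_counts = [10.0, 20.0, 30.0, 40.0]   # hypothetical raw CNN estimates
true_counts = [12.0, 22.0, 32.0, 42.0]  # hypothetical ground-truth counts

# Fit the calibration line on (estimate, ground truth) pairs, then apply it.
a, b = fit_line(cnn_counts, true_counts)
calibrated = [a * c + b for c in cnn_counts]

print("MAE before / after:", mae(true_counts, cnn_counts), mae(true_counts, calibrated))
print("RMSE after:", rmse(true_counts, calibrated))
print("Pearson-R:", pearson_r(true_counts, calibrated))
print("R^2:", r_squared(true_counts, calibrated))
```

In practice the line would be fitted on a training split and the metrics reported on held-out frames; fitting and scoring on the same four points here only keeps the sketch short.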

Acknowledgments

We are grateful to Graphic Era University for providing the computational resources used to carry out this research work. We also give special thanks to the research teams and websites that provide the Mall, SmartCity and Beijing-BRT dataset repositories. Finally, we are grateful to the anonymous reviewers for their valuable comments, which helped improve this work.

Author information

Authors and Affiliations

CSE Department, Graphic Era Deemed to be University, Dehradun, India
Ankit Tomar, Santosh Kumar, Bhaskar Pant & Umesh Kumar Tiwari

Corresponding author

Correspondence to Santosh Kumar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tomar, A., Kumar, S., Pant, B. et al. Dynamic Kernel CNN-LR model for people counting. Appl Intell 52, 55–70 (2022). https://doi.org/10.1007/s10489-021-02375-6

Accepted: 23 March 2021
Published: 22 April 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10489-021-02375-6