RDC-SAL: Refine distance compensating with quantum scale-aware learning for crowd counting and localization

Applied Intelligence

Abstract

Crowd counting and localization is an active research topic in computer vision, with applications such as video surveillance and dense object detection. Most recent works treat crowd counting and localization as a regression task solved with convolutional neural networks (CNNs). However, it is relatively hard for a basic CNN framework to extract adequate features from crowd scenes. In this work, a refine distance compensating with quantum scale-aware learning framework (RDC-SAL) is proposed for the crowd counting and localization task. It is built from three modules: Front-end quantum feature extraction, Multi-scale feature extraction, and Refine distance compensating. First, the Front-end quantum feature extraction module applies qubit rotations and Pauli operators to compute crowd features on top of a classical CNN architecture. Then, the Multi-scale feature extraction module processes the quantum features through several feature extraction branches. Finally, the Refine distance compensating module estimates the density map, using the Refine distance compensating factor to fuse the feature extraction branches, which have different Upsample layers. To the best of our knowledge, this is the first time a hybrid classical-quantum network has been introduced to model the crowd counting and localization problem. Experimental results on several benchmark datasets show that the proposed RDC-SAL restores predicted density maps with high spatial resolution for crowd scenes and achieves improved performance on the localization task compared with state-of-the-art works.
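
As a rough illustration of what "qubit rotation and Pauli operators" can mean in a hybrid classical-quantum front end, the sketch below simulates a single qubit in NumPy. It assumes that a classical feature value is encoded as the angle of an RY rotation and that the Pauli-Z expectation value is read out as the resulting feature. The abstract does not specify the circuit, so this encoding, the single-qubit setting, and the names `ry`, `pauli_z_expectation`, and `quantum_feature` are all hypothetical and are not the authors' implementation.

```python
import numpy as np

# Pauli-Z operator, used here as the measurement observable (an assumption).
PAULI_Z = np.array([[1.0, 0.0], [0.0, -1.0]])


def ry(theta):
    """Single-qubit rotation about the Y axis by angle theta."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])


def pauli_z_expectation(state):
    """Expectation value <state|Z|state> of a normalized single-qubit state."""
    return float(np.real(np.conj(state) @ PAULI_Z @ state))


def quantum_feature(x):
    """Hypothetical encoding: rotate |0> by RY(x), then read out <Z>."""
    ket0 = np.array([1.0, 0.0])
    state = ry(x) @ ket0
    return pauli_z_expectation(state)
```

For this single-qubit case the readout reduces analytically to cos(x), which gives a quick sanity check on the simulation.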

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China Youth Fund under Grant 6210023461, in part by the Guangdong Academy of Sciences’ (GDAS’) Project of Science and Technology Development under Grant 2017GDASCX-0115 and Grant 2018GDASCX-0115, in part by the Guangdong Academy of Science for the Special Fund of Introducing Doctoral Talent under Grant 2021GDASYL-20210103087, in part by the Opening Foundation of Xinjiang Production and Construction Corps Key Laboratory of Modern Agricultural Machinery under Grant BTNJ2021003, and in part by the Foundation of Engineering Research Center of Intelligence Perception and Autonomous Control, Ministry of Education of China, under Grant K100052021008.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Zhi-Ri Tang, Edmond Q. Wu, Rui Yang, Qinglong Mo, and Jingbin Li. The first draft of the manuscript was written by Ruihan Hu and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhi-Ri Tang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hu, R., Tang, ZR., Wu, E.Q. et al. RDC-SAL: Refine distance compensating with quantum scale-aware learning for crowd counting and localization. Appl Intell 52, 14336–14348 (2022). https://doi.org/10.1007/s10489-022-03238-4

  • DOI: https://doi.org/10.1007/s10489-022-03238-4
