Pedestrian object detection with fusion of visual attention mechanism and semantic computation

Multimedia Tools and Applications

Abstract

Primary visual features alone struggle to handle pedestrian detection in complex scenes. To address this, we present a pedestrian detection method that combines a visual attention mechanism with semantic computation. After computing a saliency map with the visual attention mechanism, we calculate semantic saliency maps for human skin and for the human head-shoulder region. Using a Laplacian pyramid, we establish a static visual attention model that yields a total saliency map, which is then used to complete pedestrian detection. Experimental results show that the proposed method achieves state-of-the-art performance on the INRIA dataset, with 92.78% pedestrian detection accuracy at a very competitive time cost.
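The abstract does not give the filters, weights, or combination rule of the model, so the following is only a minimal sketch of how several saliency maps might be combined through a Laplacian pyramid. Everything in it is an assumption made for illustration: the 2x2 box-filter downsampling, the pixel-replication upsampling, the rule "keep the strongest detail coefficient, average the coarse residual", the threshold, and the names `fuse_saliency` and `detect`.

```python
# Illustrative sketch only. The pyramid filters, the fusion rule and the
# threshold are assumptions; they are not taken from the paper.

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (dimensions must be even)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def upsample(img):
    """Double each dimension by pixel replication."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def subtract(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarse_residual]."""
    pyramid, current = [], img
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(subtract(current, upsample(smaller)))
        current = smaller
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    """Invert laplacian_pyramid by adding details back from coarse to fine."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = add(detail, upsample(current))
    return current

def fuse_saliency(maps, levels=2):
    """Fuse equally sized maps: strongest detail per pixel, averaged residual."""
    pyramids = [laplacian_pyramid(m, levels) for m in maps]
    fused = []
    for k in range(levels):
        layers = [p[k] for p in pyramids]
        fused.append([[max(vals, key=abs) for vals in zip(*rows)]
                      for rows in zip(*layers)])
    bases = [p[-1] for p in pyramids]
    fused.append([[sum(vals) / len(vals) for vals in zip(*rows)]
                  for rows in zip(*bases)])
    return reconstruct(fused)

def detect(total_map, threshold=0.5):
    """Binarise the total saliency map (hypothetical final detection step)."""
    return [[1 if v >= threshold else 0 for v in row] for row in total_map]

# Toy 4x4 maps standing in for the skin and head-shoulder saliency maps.
skin = [[0.0, 0.0, 0.0, 0.0],
        [0.0, 0.9, 0.8, 0.0],
        [0.0, 0.7, 0.9, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
head_shoulders = [[0.0, 0.6, 0.6, 0.0],
                  [0.2, 0.8, 0.8, 0.2],
                  [0.0, 0.1, 0.1, 0.0],
                  [0.0, 0.0, 0.0, 0.0]]
total = fuse_saliency([skin, head_shoulders], levels=2)
mask = detect(total, threshold=0.5)
```

Because each detail level stores exactly what downsampling discards, `reconstruct` inverts `laplacian_pyramid` up to floating-point rounding, so any difference between `total` and the inputs comes from the fusion rule alone. A practical implementation would more likely use Gaussian filtering for the pyramid and a learned or tuned weighting of the maps.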

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61572392) and the Shaanxi Provincial Natural Science Foundation (Grant No. 2017JC2-08).

Author information

Corresponding author

Correspondence to Feng Xiao.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiao, F., Liu, B. & Li, R. Pedestrian object detection with fusion of visual attention mechanism and semantic computation. Multimed Tools Appl 79, 14593–14607 (2020). https://doi.org/10.1007/s11042-018-7143-6

  • DOI: https://doi.org/10.1007/s11042-018-7143-6
