Abstract
Human head detection is widely used in computer vision. In practical applications, however, it is prone to false alarms caused by variations in viewing angle, lighting conditions, and cameras. This paper proposes a novel spatial attention network (SAN) that uses a saliency module to exploit the environmental information beyond the proposal, which Faster R-CNN ignores. Meanwhile, the class score and the saliency score are fused through a suitable strategy to effectively suppress false positive samples. To train and test our model, we established a dataset of 55,802 images. Experimental results show that our model significantly outperforms the Faster R-CNN model.
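The abstract does not say which fusion strategy is used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, purely for illustration, that each detection carries a class score and a saliency score in [0, 1] and that the two are combined by a weighted geometric mean. The names `fuse_scores`, `keep_detections`, `alpha`, and `threshold` are hypothetical.

```python
from typing import List, Sequence


def fuse_scores(class_scores: Sequence[float],
                saliency_scores: Sequence[float],
                alpha: float = 0.5) -> List[float]:
    """Weighted geometric mean of two per-detection scores in [0, 1].

    ASSUMPTION: this fusion rule is illustrative only; the paper's
    actual strategy is not given in the abstract.
    """
    if len(class_scores) != len(saliency_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(c ** alpha) * (s ** (1.0 - alpha))
            for c, s in zip(class_scores, saliency_scores)]


def keep_detections(fused: Sequence[float],
                    threshold: float = 0.5) -> List[int]:
    """Return indices of detections whose fused score meets the threshold."""
    return [i for i, f in enumerate(fused) if f >= threshold]


# Three hypothetical detections: the second has a high class score but
# low saliency, which is the false-positive pattern fusion should suppress.
class_scores = [0.9, 0.9, 0.6]
saliency_scores = [0.9, 0.1, 0.6]

fused = fuse_scores(class_scores, saliency_scores)
kept = keep_detections(fused)
print(kept)
```

With a geometric mean, a detection needs both scores to be reasonably high to survive, so a confident classifier output on a non-salient region is pulled down. Other rules (a product, a weighted sum, or a learned combination) would fit the abstract's description equally well.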
References
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: Advances in Neural Information Processing Systems, pp. 379–387 (2016)
Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger, arXiv preprint arXiv:1612.08242 (2016)
Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Hariharan, B., Arbeláez, P., Girshick, R., Malik, J.: Hypercolumns for object segmentation and fine-grained localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 447–456 (2015)
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, p. 4 (2017)
Kong, T., Yao, A., Chen, Y., Sun, F.: Hypernet: towards accurate region proposal generation and joint object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 845–853 (2016)
Vu, T.H., Osokin, A., Laptev, I.: Context-aware CNNs for person head detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2893–2901 (2015)
Stewart, R.: Brainwash dataset. Stanford Digital Repository (2015). http://purl.stanford.edu/sx925dc9385
Stewart, R., Andriluka, M., Ng, A.Y.: End-to-end people detection in crowded scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2325–2333 (2016)
Acknowledgments
This work was supported by the National Key Research and Development Program of China under Grant No. 2018YFB1003405.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, R., Zhang, B., Huang, Z., Zhao, X., Qiao, P., Dou, Y. (2018). Spatial Attention Network for Head Detection. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11165. Springer, Cham. https://doi.org/10.1007/978-3-030-00767-6_51
DOI: https://doi.org/10.1007/978-3-030-00767-6_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00766-9
Online ISBN: 978-3-030-00767-6
eBook Packages: Computer Science, Computer Science (R0)