Abstract
Understanding where humans look in a scene is important for many applications. Research in neuroscience and cognitive psychology shows that the human brain consistently attends to particular regions when observing an image. In this paper, we recorded and analyzed human eye-tracking data and found that these regions mainly fall on semantic objects. The concept of deep learning was itself inspired by neuroscience, and Fully Convolutional Networks (FCN), one family of deep learning methods, can segment image objects at the semantic level efficiently. We therefore propose a new visual attention model that uses an FCN to simulate the cognitive processing of a human freely observing a natural scene, and fuses attractive low-level features to predict fixation locations. Experimental results demonstrate that our model has clear advantages in biological plausibility.
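The fusion step the abstract describes (combining an FCN-derived semantic object map with low-level feature maps into a single fixation-prediction map) can be sketched as a weighted combination of normalized maps. This is a minimal illustrative sketch, not the authors' implementation: the function name `fuse_saliency`, the min-max normalization, and the `semantic_weight` parameter are all assumptions for demonstration.

```python
import numpy as np

def fuse_saliency(semantic_map, low_level_maps, semantic_weight=0.6):
    """Fuse a semantic object map (e.g. from an FCN) with low-level
    feature maps (intensity, color, ...) into one saliency map.

    All inputs are 2-D arrays of the same shape; the output lies in [0, 1].
    """
    def normalize(m):
        # Min-max normalize a map to [0, 1]; flat maps become all zeros.
        m = m.astype(float)
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    sem = normalize(semantic_map)
    # Combine the low-level channels first, then renormalize the sum.
    low = normalize(sum(normalize(m) for m in low_level_maps))
    return semantic_weight * sem + (1.0 - semantic_weight) * low

# Toy example: a 4x4 scene with one "object" region in the center.
rng = np.random.default_rng(0)
semantic = np.zeros((4, 4))
semantic[1:3, 1:3] = 1.0                 # stand-in for an FCN object mask
intensity = rng.random((4, 4))           # stand-in low-level feature maps
color = rng.random((4, 4))
saliency = fuse_saliency(semantic, [intensity, color])
```

With `semantic_weight=0.6`, any pixel inside the semantic object scores at least 0.6 while pixels outside it score at most 0.4, so fixations are biased toward semantic objects, which matches the eye-tracking finding reported above.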
Acknowledgements
We would like to thank the associate editor and all of the reviewers for their constructive comments, which improved the manuscript. This work is supported by the NSF of China (Nos. NCYM0001 and 61201319).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, N., Zhao, X., Ma, B., Zou, X. (2018). A Visual Attention Model Based on Human Visual Cognition. In: Ren, J., et al. Advances in Brain Inspired Cognitive Systems. BICS 2018. Lecture Notes in Computer Science(), vol 10989. Springer, Cham. https://doi.org/10.1007/978-3-030-00563-4_26
Print ISBN: 978-3-030-00562-7
Online ISBN: 978-3-030-00563-4