Abstract
With the increasing maturity of image saliency detection, more and more researchers are turning to video saliency detection. Current video saliency detection takes two forms: eye fixation prediction and salient object detection. In this article, we explore the relationship between them. First, we propose a fixation assisted video salient object detection network (FAVSODNet), which uses eye gaze information in videos to assist in detecting video salient objects. A fixation assisted module (FAM) is designed to deeply connect the fixation prediction (FP) task and the salient object detection (SOD) task. Under the guidance of eye fixation information, multiple salient objects in complex scenes can be detected more accurately. Moreover, when the scene changes suddenly or a new person appears, the network can better locate the correct salient objects with the aid of fixation maps. In addition, we adopt an extended multi-scale feature extraction module (EMFEM) to extract rich object features, so the network can perceive objects of variable scale in videos more comprehensively. Finally, experimental results show that our method advances the state of the art in video salient object detection.
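The paper itself gives the architectural details of FAM; purely as a rough illustration of the fixation-assisted idea (the function name, gating formula, and shapes below are our assumptions, not the authors' actual module), a predicted fixation map can act as spatial attention that reweights features in the salient object detection branch:

```python
import numpy as np

def fixation_gate(features, fixation_map):
    """Reweight spatial features with a predicted fixation map.

    features:     (C, H, W) feature tensor from the SOD branch
    fixation_map: (H, W) array with values in [0, 1] from the FP branch
    Returns features emphasised at fixated locations. The residual form
    (1 + map) attenuates non-fixated regions without zeroing them out.
    """
    assert features.shape[1:] == fixation_map.shape
    gate = 1.0 + fixation_map              # residual attention in [1, 2]
    return features * gate[None, :, :]     # broadcast over channels

# Toy example: 4 channels on a 2x2 grid, one strongly fixated location.
feats = np.ones((4, 2, 2))
fix = np.array([[0.0, 1.0],
                [0.5, 0.0]])
out = fixation_gate(feats, fix)
print(out[0])
```

In such a scheme, features at fixated pixels are amplified before the final saliency readout, which matches the paper's motivation that gaze information helps disambiguate multiple candidate objects in complex scenes.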
Acknowledgment
The authors gratefully acknowledge support for this research from the National Natural Science Foundation of China under Grant Nos. 61572351, 61876125, and 61772360.
© 2020 Springer Nature Switzerland AG
Yan, X., Wang, Z., Sun, M. (2020). Eye Fixation Assisted Detection of Video Salient Objects. In: Ren, J., et al. Advances in Brain Inspired Cognitive Systems. BICS 2019. Lecture Notes in Computer Science(), vol 11691. Springer, Cham. https://doi.org/10.1007/978-3-030-39431-8_20
DOI: https://doi.org/10.1007/978-3-030-39431-8_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39430-1
Online ISBN: 978-3-030-39431-8
eBook Packages: Computer Science (R0)