Eye Fixation Assisted Detection of Video Salient Objects

  • Conference paper
  • In: Advances in Brain Inspired Cognitive Systems (BICS 2019)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11691)


Abstract

As image saliency detection matures, research attention is increasingly shifting to video saliency detection. Video saliency detection currently takes two forms: eye fixation prediction and salient object detection. In this article, we explore the relationship between the two. First, we propose the fixation assisted video salient object detection network (FAVSODNet), which uses eye gaze information in videos to assist in detecting video salient objects. A fixation assisted module (FAM) is designed to couple the fixation prediction (FP) task and the salient object detection (SOD) task deeply. Guided by eye fixation information, multiple salient objects in complex scenes can be detected more accurately. Moreover, when the scene changes suddenly or a new person appears, the correct salient objects can still be detected with the aid of fixation maps. In addition, we adopt an extended multi-scale feature extraction module (EMFEM) to extract rich object features, so that the network can perceive objects of variable scale in videos more comprehensively. Finally, experimental results show that our method advances the state of the art in video salient object detection.
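The two ideas in the abstract, fixation-guided modulation of features (the FAM) and multi-scale feature extraction (the EMFEM), can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`fixation_gate`, `multi_scale_pool`), the residual-style gating, and the pool-and-upsample multi-scale scheme are all illustrative assumptions standing in for the learned network modules.

```python
import numpy as np

def fixation_gate(features, fixation_map):
    """Residual-style gating of a C x H x W feature tensor by a fixation map.

    Illustrative stand-in for the paper's fixation assisted module (FAM):
    regions with high fixation density are amplified, others pass through.
    """
    f = fixation_map / (fixation_map.max() + 1e-8)  # normalize to [0, 1]
    return features * (1.0 + f[None, :, :])         # broadcast over channels

def multi_scale_pool(features, scales=(1, 2, 4)):
    """Crude multi-scale aggregation (stand-in for the EMFEM idea):
    subsample at several strides, upsample back, and average, so the
    output mixes context from several receptive-field sizes.
    """
    c, h, w = features.shape
    out = np.zeros_like(features)
    for s in scales:
        pooled = features[:, ::s, ::s]                    # stride-s subsample
        up = np.repeat(np.repeat(pooled, s, axis=1), s, axis=2)[:, :h, :w]
        pad_h, pad_w = h - up.shape[1], w - up.shape[2]   # fix rounding loss
        out += np.pad(up, ((0, 0), (0, pad_h), (0, pad_w)), mode="edge")
    return out / len(scales)

feat = np.random.rand(8, 16, 16)   # C x H x W feature tensor
fix = np.random.rand(16, 16)       # per-pixel fixation density
gated = fixation_gate(multi_scale_pool(feat), fix)
print(gated.shape)                 # (8, 16, 16)
```

Note that with an all-zero fixation map the gate reduces to the identity, so fixation guidance only ever adds emphasis; the real FAM is learned and operates on deep features rather than raw maps.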

Acknowledgment

This research was supported by the National Natural Science Foundation of China under Grant Nos. 61572351, 61876125, and 61772360.

Author information

Corresponding author

Correspondence to Zheng Wang.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Yan, X., Wang, Z., Sun, M. (2020). Eye Fixation Assisted Detection of Video Salient Objects. In: Ren, J., et al. (eds.) Advances in Brain Inspired Cognitive Systems. BICS 2019. Lecture Notes in Computer Science, vol 11691. Springer, Cham. https://doi.org/10.1007/978-3-030-39431-8_20

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-39431-8_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-39430-1

  • Online ISBN: 978-3-030-39431-8

  • eBook Packages: Computer Science, Computer Science (R0)
