On the Use of 3D CNNs for Video Saliency Modeling

Conference paper in: Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020)

Abstract

There has been growing interest recently in three-dimensional (3D) convolutional neural networks (CNNs) as a powerful tool for encoding spatio-temporal representations in videos, obtained by adding a third, temporal dimension to pre-existing 2D CNNs. In this chapter, we discuss the effectiveness of 3D convolutions for capturing the important motion features in the context of video saliency prediction. The method filters spatio-temporal features across multiple adjacent frames. This cubic convolution can be applied effectively to a dense sequence of frames, propagating information from previous frames into the current one and thereby reflecting the processing mechanisms of the human visual system for better saliency prediction. We extensively evaluate the model's performance against state-of-the-art video saliency models on both 2D and 360° videos. The architecture can efficiently learn expressive spatio-temporal representations and produces high-quality video saliency maps on three large-scale 2D datasets: DHF1K, UCF-SPORTS and DAVIS. Investigations on 360° datasets such as Salient360! show how the approach can generalise.
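To make the mechanism concrete, below is a minimal PyTorch sketch; it is not the authors' 3DSAL implementation, and the model name, layer widths, and clip length are illustrative assumptions. It shows how a 3D convolution's kernel spans the temporal axis, so each output activation mixes information from several adjacent frames before the clip is collapsed into a single saliency map.

```python
import torch
import torch.nn as nn

class Tiny3DSaliency(nn.Module):
    """Illustrative 3D-CNN saliency sketch (hypothetical, not the paper's 3DSAL)."""

    def __init__(self, clip_len: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            # kernel_size=3 here means (3, 3, 3): a 3-frame temporal window
            # combined with a 3x3 spatial neighbourhood ("cubic" convolution).
            nn.Conv3d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(16, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Collapse the temporal axis so the whole clip yields one map.
        self.collapse = nn.Conv3d(16, 1, kernel_size=(clip_len, 1, 1))

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels=3, frames, height, width)
        x = self.features(clip)             # temporal mixing across frames
        x = self.collapse(x)                # (batch, 1, 1, H, W)
        return torch.sigmoid(x.squeeze(2))  # (batch, 1, H, W) saliency map

model = Tiny3DSaliency(clip_len=4)
frames = torch.randn(1, 3, 4, 64, 64)       # a dense 4-frame RGB clip
print(model(frames).shape)                  # torch.Size([1, 1, 64, 64])
```

Because the first convolution already spans three frames, and stacking two such layers widens the temporal receptive field to five, information from earlier frames propagates into the prediction for the current one, which is the behaviour the chapter exploits for saliency.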



Author information

Corresponding author: Yasser Abdelaziz Dahou Djilali.


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Dahou Djilali, Y.A., Sayah, M., McGuinness, K., O’Connor, N.E. (2022). On the Use of 3D CNNs for Video Saliency Modeling. In: Bouatouch, K., et al. (eds.) Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2020. Communications in Computer and Information Science, vol. 1474. Springer, Cham. https://doi.org/10.1007/978-3-030-94893-1_21

  • DOI: https://doi.org/10.1007/978-3-030-94893-1_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94892-4

  • Online ISBN: 978-3-030-94893-1

  • eBook Packages: Computer Science (R0)
