Compressed-domain correlates of human fixations in dynamic scenes

Khatoonabadi, Sayed Hossein; Bajić, Ivan V.; Shan, Yufeng

doi:10.1007/s11042-015-2802-3

Published: 02 August 2015

Volume 74, pages 10057–10075, (2015)

Multimedia Tools and Applications

Sayed Hossein Khatoonabadi ORCID: orcid.org/0000-0002-9927-9595¹,
Ivan V. Bajić¹ &
Yufeng Shan²

Abstract

In this paper we present two compressed-domain features that are highly indicative of saliency in natural video. We demonstrate the potential of these two features to indicate saliency by comparing their statistics around human fixation points against their statistics at control points away from fixations. Then, using these features, we construct a simple and effective saliency estimation method for compressed video, which utilizes only motion vectors, block coding modes and coded residuals from the bitstream, with partial decoding. The proposed algorithm has been extensively tested on two ground truth datasets using several accuracy metrics. The results indicate its superior performance over several state-of-the-art compressed-domain and pixel-domain algorithms for saliency estimation.
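
The fixation-versus-control comparison mentioned in the abstract can be pictured with a small sketch. This is not the authors' procedure: the toy feature map, the Chebyshev distance threshold, the function names, and the choice of AUC as the separation measure are all assumptions made for illustration only.

```python
import random

def sample_feature(feature_map, points):
    """Read the feature value at each (row, col) point."""
    return [feature_map[r][c] for r, c in points]

def control_points(height, width, fixations, min_dist, count, rng):
    """Draw random points at least min_dist (Chebyshev distance) from every fixation."""
    chosen = []
    attempts = 0
    while len(chosen) < count and attempts < 10000:
        attempts += 1
        r, c = rng.randrange(height), rng.randrange(width)
        if all(max(abs(r - fr), abs(c - fc)) >= min_dist for fr, fc in fixations):
            chosen.append((r, c))
    return chosen

def auc(positives, negatives):
    """Fraction of (fixation, control) pairs where the fixation value is larger; ties count half."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical 8x8 feature map: high values in a small block, zero elsewhere.
feature_map = [[1.0 if 1 <= r <= 3 and 1 <= c <= 4 else 0.0 for c in range(8)]
               for r in range(8)]
fixations = [(2, 2), (2, 3)]  # assumed fixation locations inside the block

rng = random.Random(0)
controls = control_points(8, 8, fixations, min_dist=3, count=10, rng=rng)

score = auc(sample_feature(feature_map, fixations),
            sample_feature(feature_map, controls))
print(score)
```

A score near 1 means the feature is consistently higher at fixations than at control points, while a score near 0.5 means it does not separate the two groups.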

Acknowledgments

This work was supported in part by the Cisco Research Award CG# 573690 and NSERC Grant RGPIN 327249.

Author information

Authors and Affiliations

Simon Fraser University, Burnaby, BC, Canada
Sayed Hossein Khatoonabadi & Ivan V. Bajić
Cisco Systems, Boxborough, MA, USA
Yufeng Shan

Corresponding author

Correspondence to Sayed Hossein Khatoonabadi.

About this article

Cite this article

Khatoonabadi, S.H., Bajić, I.V. & Shan, Y. Compressed-domain correlates of human fixations in dynamic scenes. Multimed Tools Appl 74, 10057–10075 (2015). https://doi.org/10.1007/s11042-015-2802-3

Received: 29 January 2015
Revised: 06 May 2015
Accepted: 01 July 2015
Published: 02 August 2015
Issue Date: November 2015
DOI: https://doi.org/10.1007/s11042-015-2802-3
