Abstract
Visual saliency detection has recently received considerable interest. Because most video saliency detection models are built on a spatiotemporal mechanism, we first give a brief introduction to it in this paper. After discussing the issues to be addressed, we present a novel framework for video saliency detection based on the 3D discrete shearlet transform. Instead of measuring saliency by fusing separate spatial and temporal saliency maps, the proposed model treats a video as three-dimensional data. By decomposing the video with the 3D discrete shearlet transform and reconstructing it at multiple scales, this multi-scale saliency detection model obtains a set of feature blocks that describe the video. Within each feature block, groups of successive feature maps are taken as a whole, and global contrast is computed over each group to obtain saliency maps. By fusing the saliency maps of the different levels, a final saliency map is generated for each video frame. The framework is very simple, and experimental results on ten videos show that the proposed model outperforms many existing models.
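The grouped global-contrast step described above can be sketched as follows. This is a minimal illustration, not the authors' exact formulation: the function name `global_contrast_saliency`, the group size, and the mean-absolute-difference contrast measure are all assumptions; producing the feature blocks themselves would require a 3D discrete shearlet decomposition, which is omitted here.

```python
import numpy as np

def global_contrast_saliency(feature_block, group_size=4):
    """Hypothetical sketch of grouped global contrast.

    feature_block: array of shape (T, H, W) holding one reconstructed
    feature map per frame. Successive maps are grouped, each group is
    treated as a whole, and every pixel's saliency is its mean absolute
    difference from all pixel values in that group.
    """
    T, H, W = feature_block.shape
    saliency_maps = []
    for start in range(0, T, group_size):
        group = feature_block[start:start + group_size]  # (g, H, W)
        vals = group.reshape(-1)                         # all values in the group
        # Global contrast: compare each pixel against every value in the group.
        sal = np.abs(group[..., None] - vals[None, None, None, :]).mean(axis=-1)
        saliency_maps.append(sal / (sal.max() + 1e-8))   # normalize per group
    return np.concatenate(saliency_maps, axis=0)         # back to (T, H, W)
```

Fusing the per-level outputs (e.g., averaging the maps produced from each scale's feature block) would then yield the final saliency map for each frame.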
Bao, L., Zhang, X., Zheng, Y. et al. Video saliency detection using 3D shearlet transform. Multimed Tools Appl 75, 7761–7778 (2016). https://doi.org/10.1007/s11042-015-2692-4