Salient object detection and classification for stereoscopic images

Kang, Kai; Cao, Yang; Zhang, Jing; Wang, Zengfu

doi:10.1007/s11042-014-2142-8

Published: 19 June 2014

Volume 75, pages 1443–1457 (2016)

Multimedia Tools and Applications

Abstract

Stereoscopic images have become increasingly prevalent following rapid advances in 3D capture and display techniques. However, there has been little research on visual content analysis for stereoscopic images. In this paper, we address the challenging problem of object detection and classification for stereoscopic images. We propose an iterative method in which salient object detection and object classification mutually boost each other. The method consists of two steps. In the first step, a 3D saliency detection method, which combines the contrast and occlusion cues contained in each stereoscopic image pair with the discriminative features provided by an SVM classifier, localizes the object of interest in the stereoscopic images. In the second step, the bag-of-words features of the foreground and background are pooled using the localization information and then used to train the SVM classifier. Each step benefits from the gradual improvement of the other, in both training and testing. To evaluate the approach, we assembled a six-class dataset of stereoscopic images of real objects viewed under general lighting conditions, poses, and viewpoints. Experimental results on this dataset, for both object localization and object classification, demonstrate the effectiveness of the method.
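
The second step of the abstract, pooling bag-of-words features separately over foreground and background, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pool_fg_bg`, the binary mask input, and the choice of sum pooling followed by L1 normalization are all assumptions made for illustration, since the abstract does not specify the pooling operator or how the saliency map is binarized.

```python
from typing import List, Sequence, Tuple


def pool_fg_bg(
    words: Sequence[Tuple[int, int, int]],
    mask: Sequence[Sequence[int]],
    vocab_size: int,
) -> List[float]:
    """Pool visual words into foreground and background histograms.

    words: (word_index, row, col) for each local descriptor, already
           quantized against a visual vocabulary (hypothetical input format).
    mask:  binary localization mask, 1 = salient object, 0 = background.
    Returns the foreground histogram followed by the background histogram,
    each L1-normalized (an assumed choice, not taken from the paper).
    """
    fg = [0.0] * vocab_size
    bg = [0.0] * vocab_size
    for word, row, col in words:
        # Route each descriptor to the region its pixel falls in.
        if mask[row][col]:
            fg[word] += 1.0
        else:
            bg[word] += 1.0

    def l1_normalize(hist: List[float]) -> List[float]:
        total = sum(hist)
        # Leave an empty histogram as all zeros to avoid division by zero.
        return [v / total for v in hist] if total > 0 else hist

    return l1_normalize(fg) + l1_normalize(bg)


# Toy example: a 2x2 image whose top-left pixel is foreground.
toy_mask = [[1, 0], [0, 0]]
toy_words = [(0, 0, 0), (1, 0, 1), (1, 1, 0), (2, 1, 1)]
feature = pool_fg_bg(toy_words, toy_mask, vocab_size=3)
```

The resulting vector has length `2 * vocab_size` and could serve as the input to a classifier. In an alternating scheme like the one described, the mask would be re-estimated after each classifier update and the features pooled again.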

Acknowledgments

We would like to thank the Flickr users and the NVIDIA 3D Vision Live users for sharing their photos. We also thank Yuzhen Niu, Yujie Geng, Xueqing Li, and Feng Liu for providing the website links.

Author information

Authors and Affiliations

Department of Automation, University of Science and Technology of China, Hefei, Anhui, People’s Republic of China
Kai Kang, Yang Cao, Jing Zhang & Zengfu Wang

Corresponding author

Correspondence to Yang Cao.

About this article

Cite this article

Kang, K., Cao, Y., Zhang, J. et al. Salient object detection and classification for stereoscopic images. Multimed Tools Appl 75, 1443–1457 (2016). https://doi.org/10.1007/s11042-014-2142-8

Received: 28 December 2013
Revised: 09 April 2014
Accepted: 02 June 2014
Published: 19 June 2014
Issue Date: February 2016
DOI: https://doi.org/10.1007/s11042-014-2142-8
