Salient region detection and segmentation for general object recognition and image understanding

Huang, TieJun; Tian, YongHong; Li, Jia; Yu, HaoNan

doi:10.1007/s11432-011-4487-1

Research Papers
Special Focus
Published: 03 December 2011

Volume 54, pages 2461–2470, (2011)

Science China Information Sciences

TieJun Huang¹, YongHong Tian¹, Jia Li² & HaoNan Yu¹

Abstract

General object recognition and image understanding is widely regarded as an ambitious goal for computer vision and multimedia retrieval. Despite the great effort devoted to it over the last two decades, it remains an open problem. In this paper, we propose a selective attention-driven model for general image understanding, named GORIUM (general object recognition and image understanding model). The key idea of our model is to discover recurring visual objects in a large image set, in an unsupervised manner, through selective attention modeling and pairwise matching of local invariant features. To this end, the model is formulated as a four-layer bottom-up model: salient region detection, object segmentation, automatic object discovery, and visual dictionary construction. The lowest layer uses multi-task learning to model visual saliency with bottom-up and top-down factors simultaneously, which lets it detect salient objects in an image effectively. The second layer uses a simple yet effective learning approach to generate two complementary maps from several raw saliency maps; these maps are then used to segment the salient objects precisely from a complex scene. For the third layer, we have implemented an unsupervised approach that automatically discovers general objects from a large image set by pairwise matching with local invariant features. Finally, the visual dictionary can be constructed with any of the many state-of-the-art algorithms and tools now available.
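
The third layer described in the abstract, object discovery by pairwise matching of local invariant features, can be illustrated with a small sketch. This is not the authors' implementation: it assumes each segmented salient region is already described by a list of fixed-length descriptor vectors (in practice SIFT- or SURF-like features), it swaps in a nearest-neighbour ratio test as the matching rule, and it groups regions with a union-find structure. The names `count_matches`, `discover_objects`, `ratio` and `min_matches` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist


def count_matches(desc_a, desc_b, ratio=0.8):
    """Count descriptors in desc_a whose nearest neighbour in desc_b passes a ratio test."""
    if len(desc_b) < 2:
        return 0  # the ratio test needs a best and a second-best candidate
    matches = 0
    for d in desc_a:
        # Two smallest Euclidean distances from d to the other region's descriptors.
        nearest, second = sorted(dist(d, e) for e in desc_b)[:2]
        # Accept only if the best candidate is clearly closer than the runner-up.
        if nearest < ratio * second:
            matches += 1
    return matches


def discover_objects(regions, min_matches=2, ratio=0.8):
    """Group region ids that are linked by enough pairwise descriptor matches.

    regions: dict mapping a region id to a list of equal-length descriptor tuples.
    Returns a sorted list of sorted id groups that contain at least two regions.
    Both thresholds are assumptions of this sketch, not values from the paper.
    """
    parent = {rid: rid for rid in regions}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(regions, 2):
        # Symmetric score: both matching directions must agree.
        score = min(count_matches(regions[a], regions[b], ratio),
                    count_matches(regions[b], regions[a], ratio))
        if score >= min_matches:
            parent[find(a)] = find(b)

    groups = {}
    for rid in regions:
        groups.setdefault(find(rid), []).append(rid)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy 2-D "descriptors": img1 and img2 show the same object, img3 does not.
toy_regions = {
    "img1": [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)],
    "img2": [(0.1, 0.0), (10.0, 0.1), (0.0, 9.9)],
    "img3": [(50.0, 50.0), (60.0, 60.0), (70.0, 50.0)],
}
recurring = discover_objects(toy_regions)  # [['img1', 'img2']]
```

Each returned group stands for one candidate recurring object. Comparing every pair of regions costs quadratic time in the number of regions, so a large image set would need an index or another pruning step in front of this loop; that part is outside the sketch.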

Author information

Authors and Affiliations

National Engineering Laboratory for Video Technology, School of Electrical Engineering and Computer Science, Peking University, Beijing, 100871, China
TieJun Huang, YongHong Tian & HaoNan Yu
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Jia Li

Corresponding author

Correspondence to YongHong Tian.

Additional information

HUANG TieJun was born in 1970. He received the Ph.D. degree in pattern recognition and intelligent systems from Huazhong University of Science and Technology in 1998. He is currently a professor at the School of Electrical Engineering and Computer Science of Peking University and the vice director of the National Engineering Laboratory for Video Technology of China. He is supported by the New Century Excellent Talents in University program of the Ministry of Education of China. His research interests include image understanding, video coding, digital libraries and digital rights management. He is a council member of the Chinese Institute of Electronics, a senior member of the China Computer Federation, a member of the Board of Directors of the Digital Media Project and a member of the advisory board of IEEE Computing Now.

TIAN YongHong was born in 1975. He received the Ph.D. degree in computer application technology from the Institute of Computing Technology, Chinese Academy of Sciences, in 2005. He is currently an associate professor at the National Engineering Laboratory for Video Technology, School of Electrical Engineering and Computer Science of Peking University. His research interests include machine learning and multimedia content analysis, retrieval, and copyright management. He is a senior member of IEEE.

Electronic supplementary material

Supplementary material, approximately 11.5 MB.

About this article

Cite this article

Huang, T., Tian, Y., Li, J. et al. Salient region detection and segmentation for general object recognition and image understanding. Sci. China Inf. Sci. 54, 2461–2470 (2011). https://doi.org/10.1007/s11432-011-4487-1

Received: 15 June 2011
Accepted: 27 September 2011
Published: 03 December 2011
Issue Date: December 2011
DOI: https://doi.org/10.1007/s11432-011-4487-1
