Joint salient object detection and existence prediction

  • Research Article
  • Frontiers of Computer Science

Abstract

Recent advances in supervised salient object detection modeling have resulted in significant performance improvements on benchmark datasets. However, most existing salient object detection models assume that at least one salient object exists in the input image. This assumption often leads to less appealing saliency maps on background images that contain no salient object at all. Handling those cases can therefore reduce the false positive rate of a model. In this paper, we propose a supervised learning approach for jointly addressing the salient object detection and existence prediction problems. Given a set of background-only images and images with salient objects, as well as their salient object annotations, we adopt the structural SVM framework and formulate the two problems jointly in a single integrated objective function: saliency labels of superpixels are involved in a classification term conditioned on the salient object existence variable, which in turn depends on both global image and regional saliency features and saliency label assignments. The loss function also considers both image-level and region-level misclassifications. Extensive evaluations on benchmark datasets validate the effectiveness of our proposed joint approach compared to the baseline and state-of-the-art models.
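
As a rough illustration of the kind of objective described above, a generic margin-rescaled structural SVM with a two-part loss can be written as follows. All symbols are assumptions introduced here for illustration and are not the paper's notation: x_n is a training image, e_n in {0, 1} its salient object existence label, y_n its vector of superpixel saliency labels over M_n superpixels, Ψ a joint feature map, C a regularization constant, and λ a hypothetical weight balancing image-level against region-level errors.

```latex
% Generic margin-rescaled structural SVM (illustrative sketch, not the paper's exact objective)
\begin{aligned}
\min_{\mathbf{w},\,\boldsymbol{\xi}}\quad
  & \frac{1}{2}\lVert \mathbf{w} \rVert^2 + \frac{C}{N}\sum_{n=1}^{N}\xi_n \\
\text{s.t.}\quad
  & \mathbf{w}^\top \Psi(x_n, e_n, \mathbf{y}_n) - \mathbf{w}^\top \Psi(x_n, e, \mathbf{y})
    \ge \Delta(e_n, \mathbf{y}_n, e, \mathbf{y}) - \xi_n
    \quad \forall\, (e, \mathbf{y}), \\
  & \xi_n \ge 0, \quad n = 1, \dots, N.
\end{aligned}

% Assumed loss: image-level (existence) error plus averaged region-level (superpixel) error
\Delta(e_n, \mathbf{y}_n, e, \mathbf{y})
  = \lambda\,\mathbb{1}[e \ne e_n]
  + \frac{1}{M_n}\sum_{i=1}^{M_n}\mathbb{1}[y_i \ne y_{n,i}]
```

In this sketch the first loss term penalizes a wrong existence prediction for the whole image, and the second penalizes the fraction of mislabeled superpixels; the actual feature map and loss weighting used in the article may differ.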

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant Nos. 61572264, 61620106008) and the CAST young talents plan.

Author information

Corresponding author

Correspondence to Ming-Ming Cheng.

Additional information

Huaizu Jiang is currently a PhD student in the College of Information and Computer Sciences, University of Massachusetts, Amherst, USA. He received his BS and MS degrees from Xi’an Jiaotong University, China in 2005 and 2009, respectively. He is interested in how to teach an intelligent machine to understand visual scenes like a human.

Ming-Ming Cheng received his PhD degree from Tsinghua University, China in 2012. He then worked as a research fellow with Prof. Philip Torr at the University of Oxford, UK. He is now an associate professor at Nankai University, China. His research interests include computer graphics, computer vision, and image processing.

Shi-Jie Li received his BS degree from the University of Electronic Science and Technology of China, China in 2016. He is now a master's student in the Department of Computer Science, Nankai University, China, working with Prof. Ming-Ming Cheng.

Ali Borji received his BS and MS degrees in computer engineering from Petroleum University of Technology, Tehran, Iran in 2001 and Shiraz University, Shiraz, Iran in 2004, respectively. He received his PhD in cognitive neurosciences from the Institute for Studies in Fundamental Sciences (IPM), Tehran, Iran in 2009 and spent four years as a postdoctoral scholar at iLab, University of Southern California, USA from 2010 to 2014. He is currently an assistant professor at the University of Central Florida, Orlando, USA. His research interests include visual attention, active learning, object and scene recognition, and cognitive and computational neurosciences.

Jingdong Wang received his PhD degree in computer science from The Hong Kong University of Science and Technology, China in 2007. He is currently a Lead Researcher with the Internet Media Group, Microsoft Research, Beijing, China. His current research interests include computer vision, machine learning, and multimedia. He has served as an Area Chair for CVPR 2017, ECCV 2016, ACMMM 2015, and ICME 2015, and as a Track Chair for ICME 2012. He is an Editorial Board Member of the IEEE Transactions on Multimedia and the International Journal of Multimedia Tools and Applications, and an Associate Editor of the International Journal of Neurocomputing. He has shipped more than 10 technologies to Microsoft products, including XiaoIce Chatbot, Microsoft cognitive service, and Bing search.

About this article

Cite this article

Jiang, H., Cheng, MM., Li, SJ. et al. Joint salient object detection and existence prediction. Front. Comput. Sci. 13, 778–788 (2019). https://doi.org/10.1007/s11704-017-6613-8

  • DOI: https://doi.org/10.1007/s11704-017-6613-8
