
Top-Down Saliency Detection via Contextual Pooling

Published in the Journal of Signal Processing Systems.

Abstract

The goal-driven top-down mechanism plays an important role in object detection and recognition. In this paper, we propose a top-down computational model for goal-driven saliency detection based on the coding-based classification framework. It consists of four successive steps: feature extraction, descriptor coding, contextual pooling and saliency prediction. In particular, we investigate the effect of spatial context information on saliency detection, and propose a block-wise spatial pooling operation that exploits contextual cues at multiple neighborhood scales and orientations. Experimental results on three datasets demonstrate that our method effectively exploits contextual information and achieves state-of-the-art performance on the top-down saliency detection task.
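
The four-step pipeline named in the abstract can be pictured with a short sketch: code local descriptors on a spatial grid, pool the codes over neighboring blocks at several scales and orientations, and score each location with a linear model. The NumPy sketch below is only an illustration of this idea under stated assumptions (a grid of pre-computed codes; the radii, the 8 directions and max pooling are our own illustrative choices); it is not the authors' implementation.

    # Minimal sketch: descriptor coding -> block-wise contextual pooling ->
    # linear saliency prediction. Assumes `codes` is an (H, W, K) grid of
    # already-coded local descriptors (e.g. LLC codes of dense SIFT); radii,
    # directions and the max-pooling operator are illustrative assumptions.
    import numpy as np

    def contextual_pool(codes, radii=(1, 2)):
        """For every grid cell, max-pool coded descriptors along 8 orientations
        within each radius (the neighborhood scales), then concatenate the
        pooled blocks with the cell's own code."""
        H, W, K = codes.shape
        m = max(radii)
        padded = np.pad(codes, ((m, m), (m, m), (0, 0)), mode="edge")
        directions = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                      if (dy, dx) != (0, 0)]
        feats = [codes]  # the center block itself
        for r in radii:
            for dy, dx in directions:
                # cells lying in direction (dy, dx) at distances 1..r
                block = np.stack([padded[m + dy * s:m + dy * s + H,
                                         m + dx * s:m + dx * s + W, :]
                                  for s in range(1, r + 1)])
                feats.append(block.max(axis=0))
        return np.concatenate(feats, axis=-1)  # (H, W, K * (1 + 8 * len(radii)))

    def predict_saliency(codes, w, b=0.0):
        """Score every cell with a linear model (e.g. trained with LibLinear [34])
        on its contextual feature to obtain a coarse saliency map."""
        return contextual_pool(codes) @ w + b

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        codes = rng.random((16, 16, 32))           # stand-in for coded descriptors
        w = rng.standard_normal(32 * (1 + 8 * 2))  # matches radii=(1, 2)
        print(predict_saliency(codes, w).shape)    # (16, 16)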

Notes

  1. http://lear.inrialpes.fr/data

  2. http://labelme.csail.mit.edu

  3. We use the code of [32] for calculating the AP value; a minimal sketch of the metric is given after these notes.

  4. We adopt the implementation by LibLinear code package [34].

  5. We obtain the saliency maps for JCDL from the pre-trained models provided by the authors of [20], which can be downloaded from http://faculty.ucmerced.edu/mhyang/pubs.html. For the top-down saliency models in [16] and [13], we produce the corresponding saliency maps using the code from the authors' websites: http://ilab.usc.edu/toolkit/downloads.shtml and http://people.csail.mit.edu/tjudd/WherePeopleLook/index.html, respectively.
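
As a companion to note 3: the evaluation there relies on the PASCAL VOC code of [32]. The snippet below is only a minimal, non-interpolated sketch of pixel-level average precision against a binary ground-truth mask, not the VOC implementation.

    # Minimal sketch of pixel-level average precision (AP) for a saliency map;
    # the paper uses the PASCAL VOC code of [32], so treat this non-interpolated
    # version purely as a stand-in for the metric.
    import numpy as np

    def average_precision(saliency, gt_mask):
        """AP of saliency scores with respect to a binary ground-truth mask."""
        scores = saliency.ravel()
        labels = gt_mask.ravel().astype(bool)
        order = np.argsort(-scores)                    # rank pixels by saliency
        tp = np.cumsum(labels[order])
        precision = tp / np.arange(1, scores.size + 1)
        recall = tp / max(labels.sum(), 1)
        # area under the precision-recall curve (step-wise integration)
        return float(np.sum(np.diff(np.concatenate(([0.0], recall))) * precision))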

References

  1. Gao, D., Han, S., Vasconcelos, N. (2009). Discriminant saliency, the detection of suspicious coincidences, and applications to visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 989–1005.

  2. Lee, Y.B., & Lee, S. (2011). Robust face detection based on knowledge-directed specification of bottom-up saliency. ETRI Journal, 33(4), 600–610.

  3. Achanta, R., & Susstrunk, S. (2009). Saliency detection for content-aware image resizing. In Proceedings of IEEE international conference on image processing (pp. 1005–1008).

  4. Guo, C., & Zhang, L. (2010). A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression. IEEE Transactions on Image Processing, 19(1), 185–198.

  5. Ma, Q., & Zhang, L. (2008). Image quality assessment with visual attention. In Proceedings of international conference on pattern recognition.

  6. Loupias, E., Sebe, N., Bres, S., Jolion, J.M. (2000). Wavelet-based salient points for image retrieval. In Proceedings of IEEE international conference on image processing (pp. 518–521).

  7. Itti, L., Koch, C., Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 1254–1259.

  8. Harel, J., Koch, C., Perona, P. (2006). Graph-based visual saliency. In Proceedings of advances in neural information processing systems (pp. 545–552).

  9. Bruce, N., & Tsotsos, J. (2006). Saliency based on information maximization. In Proceedings of advances in neural information processing systems (pp. 155–162).

  10. Hou, X., & Zhang, L. (2007). Saliency detection: a spectral residual approach. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 2280–2287).

  11. Hou, X., & Zhang, L. (2008). Dynamic visual attention: Searching for coding length increments. In Proceedings of advances in neural information processing systems (pp. 681–688).

  12. Achanta, R., Hemami, S., Estrada, F., Susstrunk, S. (2009). Frequency-tuned salient region detection. In Proceedings of IEEE international conference on computer vision and pattern recognition (Vol. 1–4, pp. 1597–1604).

  13. Judd, T., Ehinger, K., Durand, F., Torralba, A. (2009). Learning to predict where humans look. In Proceedings of IEEE international conference on computer vision (pp. 2106–2113).

  14. Cheng, M.M., Zhang, G.X., Mitra, N.J., Huang, X.L., Hu, S.M. (2011). Global contrast based salient region detection. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 409–416).

  15. Zhou, Q., Li, N.Y., Yang, Y., Chen, P., Liu, W.Y. (2012). Corner-surround contrast for saliency detection. In Proceedings of IEEE international conference on pattern recognition (pp. 1–4).

  16. Navalpakkam, V., & Itti, L. (2006). An integrated model of top-down and bottom-up attention for optimizing detection speed. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 2049–2056).

  17. Frintrop, S., Backer, G., Rome, E. (2005). Goal-directed search with a top-down modulated computational attention system. In Proceedings of pattern recognition, LNCS (Vol. 3663, pp. 117–124).

  18. Oliva, A., Torralba, A., Castelhano, M.S., Henderson, J.M. (2003). Top-down control of visual attention in object detection. In Proceedings of IEEE international conference on image processing (pp. 253–256).

  19. Liu, T., Yuan, Z., Sun, J., Wang, J., Zheng, N., Tang, X., Shum, H.-Y. (2011). Learning to detect a salient object. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(2), 353–367.

  20. Yang, J., & Yang, M.-H. (2012). Top-down visual saliency via joint CRF and dictionary learning. In Proceedings of IEEE international conference on computer vision and pattern recognition.

  21. Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y. (2010). Locality-constrained linear coding for image classification. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 3360–3367).

  22. Yang, J., Yu, K., Gong, Y., Huang, T. (2009). Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of IEEE international conference on computer vision and pattern recognition.

  23. Zhu, J., Zou, W., Yang, X., Zhang, R., Zhou, Q., Zhang, W. (2012). Image classification by hierarchical spatial pooling with partial least squares analysis. In Proceedings of British machine vision conference.

  24. Lowe, D.G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110.

  25. Yu, K., Zhang, T., Gong, Y. (2009). Nonlinear learning using local coordinate coding. In Proceedings of advances in neural information processing systems (pp. 2223–2231).

  26. Torralba, A., Oliva, A., Castelhano, M.S., Henderson, J.M. (2006). Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychological Review, 113, 766–786.

  27. Hong, R., Wang, M., Xu, M., Yan, S., Chua, T.-S. (2010). Dynamic captioning: video accessibility enhancement for hearing impairment. In Proceedings of ACM multimedia (pp. 421–430).

  28. Wang, M., Hong, R., Yuan, X.-T., Yan, S., Chua, T.-S. (2012). Movie2Comics: towards a lively video content presentation. IEEE Transactions on Multimedia, 14, 858–870.

  29. Carbonetto, P., Dorkó, G., Schmid, C., Kück, H., de Freitas, N. (2008). Learning to recognize objects with little supervision. International Journal of Computer Vision, 77, 219–237.

  30. Russell, B., Torralba, A., Murphy, K., Freeman, W.T. (2007). LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision, 77, 157–173.

  31. Opelt, A., Pinz, A., Fussenegger, M., Auer, P. (2006). Generic object recognition with boosting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3), 416–431.

  32. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A. (2010). The Pascal Visual Object Classes (VOC) challenge. International Journal of Computer Vision, 88(2), 303–338.

  33. Shen, X., & Wu, Y. (2012). A unified approach to salient object detection via low rank matrix recovery. In Proceedings of IEEE international conference on computer vision and pattern recognition.

  34. Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R., Lin, C.-J. (2008). LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9, 1871–1874.

  35. Zhai, Y., & Shah, M. (2006). Visual attention detection in video sequences using spatiotemporal cues. In Proceedings of the 14th annual ACM international conference on multimedia (pp. 815–824).

  36. Qiu, Y., Zhu, J., Zhang, R., Huang, J. (2012). Top-Down saliency by multi-scale contextual pooling. In Proceedings of the 13th Pacific-Rim conference on multimedia (pp. 294–305).

Acknowledgments

The authors thank Quan Zhou for valuable comments. We also thank Mingming Cheng, Ming-Hsuan Yang, Laurent Itti and Tilke Judd for providing the code of their methods for the experimental comparison.

Author information

Corresponding author

Correspondence to Jun Zhu.

Additional information

This work is funded by NSFC projects (61071155, 61201446), the 973 National Program (2010CB731401, 2010CB731406), and STCSM (12DZ2272600).

About this article

Cite this article

Zhu, J., Qiu, Y., Zhang, R. et al. Top-Down Saliency Detection via Contextual Pooling. J Sign Process Syst 74, 33–46 (2014). https://doi.org/10.1007/s11265-013-0768-9
