poster

Beyond tag relevance: integrating visual attention model and multi-instance learning for tag saliency ranking

Authors:

De Xu

CIVR '10: Proceedings of the ACM International Conference on Image and Video Retrieval

Pages 288 - 295

https://doi.org/10.1145/1816041.1816084

Published: 05 July 2010

Abstract

Tag ranking has recently emerged as an important research topic because of its potential application to web image search. Conventional tag ranking approaches mainly rank tags according to their relevance to a given image. Such algorithms, however, rely heavily on a large-scale image dataset and a proper similarity measure to retrieve semantically relevant images with multiple labels. In contrast to existing tag relevance ranking algorithms, this paper proposes a novel tag saliency ranking scheme, which aims to automatically rank the tags associated with a given image according to their saliency with respect to the image content. To this end, we present an integrated framework for tag saliency ranking that combines a visual attention model with a multi-instance learning algorithm to infer the saliency order of the tags for the given image. Specifically, tags annotated at the image level are first propagated to the region level via an efficient multi-instance learning algorithm; a visual attention model is then employed to measure the importance of each region in the image; finally, tags are ranked according to the saliency values of their corresponding regions. Experiments conducted on the COREL and MSRC image datasets demonstrate the effectiveness and efficiency of the proposed framework.
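
The last two steps of the pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the multi-instance learning algorithm or the attention model, so the sketch assumes that a per-pixel saliency map, a region label map, and a region-to-tag assignment are already available. The function names, the mean-over-pixels region score, and the max-over-regions tag score are all hypothetical choices made for illustration.

```python
def region_saliency_from_map(saliency_map, label_map):
    """Average the per-pixel saliency inside each segmented region.

    saliency_map: 2D list of floats (output of any attention model; assumed given).
    label_map:    2D list of region ids with the same shape (assumed given).
    """
    totals, counts = {}, {}
    for sal_row, lab_row in zip(saliency_map, label_map):
        for saliency, region_id in zip(sal_row, lab_row):
            totals[region_id] = totals.get(region_id, 0.0) + saliency
            counts[region_id] = counts.get(region_id, 0) + 1
    return {r: totals[r] / counts[r] for r in totals}


def rank_tags_by_saliency(region_saliency, region_tags):
    """Rank tags by the saliency of the regions they were propagated to.

    region_tags maps region id -> tag and stands in for the output of the
    region-level tag propagation step (assumed given here).
    A tag attached to several regions takes the maximum region saliency,
    which is an assumed aggregation rule, not one stated in the abstract.
    """
    scores = {}
    for region_id, tag in region_tags.items():
        best_so_far = scores.get(tag, float("-inf"))
        scores[tag] = max(best_so_far, region_saliency[region_id])
    # Highest saliency first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy example: a 2x3 image segmented into three regions.
saliency_map = [[0.9, 0.8, 0.1],
                [0.7, 0.2, 0.1]]
label_map = [[0, 0, 1],
             [0, 2, 1]]
region_tags = {0: "tiger", 1: "sky", 2: "grass"}

ranking = rank_tags_by_saliency(
    region_saliency_from_map(saliency_map, label_map), region_tags
)
print([tag for tag, _ in ranking])
```

In this toy input the "tiger" region covers the most salient pixels, so its tag is ranked first, followed by "grass" and "sky". A different aggregation rule, such as an area-weighted sum, would be an equally plausible reading of the abstract.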

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
2. Information systems
  1. Information retrieval
    1. Document representation

Published In

CIVR '10: Proceedings of the ACM International Conference on Image and Video Retrieval

July 2010

492 pages

ISBN:9781450301176

DOI:10.1145/1816041

Conference Chairs:
Shipeng Li (Microsoft Research Asia, China), Xinbo Gao (Xidian University, China), Nicu Sebe (University of Trento, Italy)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Funding Sources

China Postdoctoral Science Foundation
Fundamental Research Funds for the Central Universities
National Natural Science Foundation of China

Conference

CIVR '10: International Conference on Image and Video Retrieval

Sponsor: SIGMM

July 5 - 7, 2010

Xi'an, China

Article Metrics

Total Citations: 18
Total Downloads: 372
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 20 Feb 2025

Cited By

Guo J, Ren T, Huang L, Bei J (2019). Saliency detection on sampled images for tag ranking. Multimedia Systems 25:1 (35-47). DOI: 10.1007/s00530-017-0546-9. Online publication date: 1-Jun-2019.
https://dl.acm.org/doi/10.1007/s00530-017-0546-9
Shen M (2018). Social image tag enrichment based on textual similarity modeling. Multimedia Tools and Applications 77:3 (3659-3676). DOI: 10.1007/s11042-017-5184-x. Online publication date: 1-Feb-2018.
https://dl.acm.org/doi/10.1007/s11042-017-5184-x
Liu D, Hua X, Zhang H (2018). Content-based tag processing for Internet social images. Multimedia Tools and Applications 51:2 (723-738). DOI: 10.1007/s11042-010-0647-3. Online publication date: 31-Dec-2018.
https://dl.acm.org/doi/10.1007/s11042-010-0647-3
Tang J, Li M, Li Z, Zhao C (2018). Tag ranking based on salient region graph propagation. Multimedia Systems 21:3 (267-275). DOI: 10.1007/s00530-014-0357-1. Online publication date: 27-Dec-2018.
https://dl.acm.org/doi/10.1007/s00530-014-0357-1
Sun G, Wu X, Peng Q (2016). Part-based clothing image annotation by visual neighbor retrieval. Neurocomputing 213:C (115-124). DOI: 10.1016/j.neucom.2015.12.141. Online publication date: 12-Nov-2016.
https://dl.acm.org/doi/10.1016/j.neucom.2015.12.141
Cao Y, Kang K, Zhang S, Zhang J, Wang Z (2016). Automatic tag saliency ranking for stereo images. Neurocomputing 172 (9-18). DOI: 10.1016/j.neucom.2014.09.097. Online publication date: Jan-2016.
https://doi.org/10.1016/j.neucom.2014.09.097
Zhang J, Liu X, Zhuo L, Wang C (2015). Social images tag ranking based on visual words in compressed domain. Neurocomputing 153 (278-285). DOI: 10.1016/j.neucom.2014.11.027. Online publication date: Apr-2015.
https://doi.org/10.1016/j.neucom.2014.11.027
Qian X, Hua X, Tang Y, Mei T (2014). Social Image Tagging With Diverse Semantics. IEEE Transactions on Cybernetics 44:12 (2493-2508). DOI: 10.1109/TCYB.2014.2309593. Online publication date: Dec-2014.
https://doi.org/10.1109/TCYB.2014.2309593
Xin Liu, Zhang J, Zhuo L, Li Z (2014). Optimized tag ranking based on visual vocabulary for social images in compressed domain. 2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) (1-5). DOI: 10.1109/ICMEW.2014.6890529. Online publication date: Jul-2014.
https://doi.org/10.1109/ICMEW.2014.6890529
Sun J, Feng S, Wang W, Lang C, Chua T, Lu K, Mei T, Wu X (2013). Personalized image recommendation and retrieval via latent SVM based model. Proceedings of the Fifth International Conference on Internet Multimedia Computing and Service (223-226). DOI: 10.1145/2499788.2499818. Online publication date: 17-Aug-2013.
https://dl.acm.org/doi/10.1145/2499788.2499818
