Attribute-Augmented Semantic Hierarchy: Towards a Unified Framework for Content-Based Image Retrieval

Published: 01 October 2014

Abstract

This article presents a novel attribute-augmented semantic hierarchy (A2SH) and demonstrates its effectiveness in bridging both the semantic gap and the intention gap in content-based image retrieval (CBIR). A2SH organizes semantic concepts into multiple semantic levels and augments each concept with a set of related attributes. The attributes describe the multiple facets of a concept and act as an intermediate bridge between the concept and low-level visual content. A hierarchical semantic similarity function is learned to characterize the semantic similarities among images for retrieval. To better capture user search intent, a hybrid feedback mechanism is developed that collects feedback on both attributes and images. This feedback is then used to refine the search results based on A2SH. We use A2SH as the basis for a unified content-based image retrieval system and conduct extensive experiments on a large-scale dataset of over one million Web images. Experimental results show that the proposed A2SH can characterize the semantic affinities among images accurately and can shape user search intent quickly, leading to more accurate search results than state-of-the-art CBIR solutions.
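
To make the structure described in the abstract concrete, the following is a minimal illustrative sketch in Python, not the authors' method. It assumes a toy two-level hierarchy in which each concept carries a list of attribute names, and it replaces the learned hierarchical similarity function with a hand-written rule: attribute agreement is accumulated over the concepts that two images share on their paths from the root. The class Concept, the functions attribute_similarity and hierarchical_similarity, and all concept names, attribute names, and scores are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A node in a toy semantic hierarchy, augmented with attribute names."""
    name: str
    attributes: List[str]
    parent: Optional["Concept"] = None

    def path_from_root(self) -> List["Concept"]:
        """Return the chain of concepts from the root down to this node."""
        node, path = self, []
        while node is not None:
            path.append(node)
            node = node.parent
        return path[::-1]


def attribute_similarity(a: Dict[str, float], b: Dict[str, float],
                         names: List[str]) -> float:
    """Mean agreement of attribute scores (each in [0, 1]) over `names`."""
    if not names:
        return 0.0
    return sum(1.0 - abs(a.get(n, 0.0) - b.get(n, 0.0)) for n in names) / len(names)


def hierarchical_similarity(concept_a: Concept, scores_a: Dict[str, float],
                            concept_b: Concept, scores_b: Dict[str, float]) -> float:
    """Hand-written stand-in for a learned similarity (an assumption):
    sum attribute agreement over shared ancestors, then normalize by the
    longer of the two root paths."""
    path_a = concept_a.path_from_root()
    path_b = concept_b.path_from_root()
    total = 0.0
    for node_a, node_b in zip(path_a, path_b):
        if node_a is not node_b:  # the two paths diverge here
            break
        total += attribute_similarity(scores_a, scores_b, node_a.attributes)
    return total / max(len(path_a), len(path_b))


# Hypothetical hierarchy: animal -> {dog, cat}, each with its own attributes.
animal = Concept("animal", ["furry", "has_legs"])
dog = Concept("dog", ["snout"], parent=animal)
cat = Concept("cat", ["whiskers"], parent=animal)

# Hypothetical per-image attribute scores (e.g., attribute classifier outputs).
dog_1 = {"furry": 1.0, "has_legs": 1.0, "snout": 1.0}
dog_2 = {"furry": 1.0, "has_legs": 1.0, "snout": 0.5}
cat_1 = {"furry": 1.0, "has_legs": 1.0, "whiskers": 1.0}

print(hierarchical_similarity(dog, dog_1, dog, dog_2))  # 0.75
print(hierarchical_similarity(dog, dog_1, cat, cat_1))  # 0.5
```

In this toy setting, two images of the same concept are compared at every level of the hierarchy, while images of sibling concepts are compared only on the attributes of their shared ancestor, so the dog pair scores higher than the dog and cat pair. In the article the similarity function is learned from data rather than fixed by hand.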

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 11, Issue 1s
    Special Issue on Multiple Sensorial (MulSeMedia) Multimodal Media: Advances and Applications
    September 2014
    260 pages
    ISSN:1551-6857
    EISSN:1551-6865
    DOI:10.1145/2675060
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 October 2014
    Accepted: 01 June 2014
    Revised: 01 May 2014
    Received: 01 February 2014
    Published in TOMM Volume 11, Issue 1s

    Author Tags

    1. Image retrieval
    2. attribute
    3. semantic hierarchy

    Qualifiers

    • Research-article
    • Research
    • Refereed
