Abstract
We present, step by step, an enhanced multi-modality ontology-based approach for web image retrieval. Several ontology-based approaches have been proposed in the field of multimedia retrieval. Our multi-modality approach is one of the earliest attempts to integrate information from different modalities and to apply the model in a complex domain. To develop the model, we need to answer the following questions: (1) how to find a proper structure and construct an ontology that can integrate information from different modalities; (2) how to quantify the matching degree (concept similarity) and provide an independent ranking mechanism; (3) how to ensure the scalability of this approach when applied to large domains. The first question is answered by our multi-modality ontology, which is discussed in Wang et al. (Does ontology help in image retrieval? In: Asia-Pacific workshop on visual information processing, 2006) and its extension (Wang et al., Does ontology help in image retrieval?—a comparison between keyword, text ontology and multi-modality ontology approaches, ACM Press, New York, NY, USA, pp 109–112, 2006). More details about this work are given later. The main contribution of this paper is a new ranking mechanism that uses Spearman’s rank correlation to measure the similarity of concepts in the ontology, taking the priorities of information from different modalities into consideration. This algorithm answers the second question: the semantic matchmaking result is quantified and the degree of similarity between concepts is calculated. For the third question, importing ontologies resolves the scalability issue, but computing concept similarity and identifying relationships when integrating different ontologies is beyond the scope of this paper. To demonstrate that our multi-modality ontology and concept similarity ranking are a step in the right direction, we chose the animal kingdom as our domain.
We believe this domain is challenging because images depict animals in a wide range of aspects, poses, configurations and appearances. We experimented with a data set of 4,000 web images. Based on ground truth, we analyze the image content and text information, build the enhanced multi-modality ontology and compare the retrieval results. The results show that we can classify even closely related animal species that share similar appearances, and that we can infer their hidden relationships from the canine family graph. By assigning a ranking to the semantic relationships, we show that our improved model achieves good accuracy and performs comparably to the Google re-ranking result in our previous work.
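To make the ranking idea concrete, the following is a minimal sketch of Spearman's rank correlation coefficient in Python. It is not the paper's implementation: the abstract does not specify how concepts are encoded or how modality priorities enter the computation, so the function names (`average_ranks`, `spearman_rho`) and the idea of comparing two lists of per-feature scores are assumptions made purely for illustration.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean = (n + 1) / 2  # the mean of ranks 1..n, also when ties are averaged
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var_x = sum((a - mean) ** 2 for a in rx)
    var_y = sum((b - mean) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical per-feature scores for a query concept and a candidate concept.
    query_scores = [1, 2, 3, 4, 5]
    candidate_scores = [2, 1, 4, 3, 5]
    print(spearman_rho(query_scores, candidate_scores))
```

A value near 1 means the two concepts order their features almost identically, a value near -1 means the orderings are reversed. How such a coefficient would be weighted by modality priority is left open here, since the abstract does not say.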










References
IEEE Computer Society (2006) 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR 2006). IEEE Computer Society, New York, NY, USA, pp 17–22, June 2006
Aslandogan YA, Thier C, Yu CT, Zou J, Rishe N (1997) Using semantic contents and wordnet in image retrieval. In: SIGIR. ACM, pp 286–295
Berg TL, Forsyth DA (2006) Animals on the web. In: CVPR (2) 2006 IEEE computer society conference on computer vision and pattern recognition, pp 1463–1470
Chang SK, Hsu A (1992) Image information systems: where do we go from here? IEEE Trans Knowl Data Eng 4(5):431–442
Fan L, Li B (2006) A hybrid model of image retrieval based on ontology technology and probabilistic ranking. In: Proceedings of the 2006 IEEE/WIC/ACM international conference on web intelligence, pp 477–480
Gao Y, Fan J (2006) Incorporating concept ontology to enable probabilistic concept reasoning for multi-level image annotation. In: MIR ’06: Proceedings of the 8th ACM international workshop on multimedia information retrieval. ACM Press, New York, NY, USA, pp 79–88. http://doi.acm.org/10.1145/1178677.1178691
Gordon A (2001) Browsing image collections with representations of common-sense activities. J Am Soc Inf Sci Technol 52(11):925–929
Grauman K, Darrell T (2005) The pyramid match kernel: discriminative classification with sets of image features. In: ICCV. IEEE Computer Society, pp 1458–1465
Grauman K, Darrell T (2006) Unsupervised learning of categories from sets of partially matching image features. In: CVPR (1) 2006 IEEE computer society conference on computer vision and pattern recognition, pp 19–25
Grosky W, Sreenath D, Fotouhi F (2002) Emergent semantics and the multimedia semantic web. ACM SIGMOD Record 31(4):54–58
Gruber TR (1993) A translation approach to portable ontologies. Knowl Acquis 5(2):199–220
Haarslev V, Möller R (2001) Racer system description. In: Goré R, Leitsch A, Nipkow T (eds) IJCAR. Lecture notes in computer science, vol 2083. Springer, pp 701–706
Hu B, Dasmahapatra S, Lewis PH, Shadbolt N (2003) Ontology-based medical image annotation with description logics. In: ICTAI. IEEE Computer Society, pp 77–82
Hunter J (2001) Adding multimedia to the semantic web: building an MPEG-7 ontology. In: Cruz IF, Decker S, Euzenat J, McGuinness DL (eds) SWWS, pp 261–283
Hyvonen E, Styrman A, Saarela S (2002) Ontology-based image retrieval. Towards the semantic web and web services. In: Proceedings of XML Finland 2002 conference, pp 15–27
Jing F, Wang C, Yao Y, Deng K, Zhang L, Ma W (2006) IGroup: web image search results clustering. In: Proceedings of the 14th annual ACM international conference on multimedia, pp 377–384
Lazebnik S, Schmid C, Ponce J (2006) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: CVPR (2) 2006 IEEE computer society conference on computer vision and pattern recognition, pp 2169–2178
Liu S, Chia LT, Chan S (2004) Ontology for nature-scene image retrieval. In: Meersman R, Tari Z (eds) CoopIS/DOA/ODBASE (2). Lecture notes in computer science, vol 3291. Springer, pp 1050–1061
Mehrotra S, Rui Y, Ortega M, Huang TS (1997) Supporting content-based queries over images in mars. In: ICMCS, pp 632–633
Miller G, Beckwith R, Fellbaum C, Gross D, Miller K (1990) Wordnet: an on-line lexical database. Int J Lexicogr 3:235–244
Pentland A, Picard RW, Sclaroff S (1994) Photobook: tools for content-based manipulation of image databases. In: Storage and retrieval for image and video databases (SPIE), pp 34–47
Popescu A, Moëllic P, Millet C (2007) SemRetriev: an ontology driven image retrieval system. In: Proceedings of the 6th ACM international conference on image and video retrieval, pp 113–116
Radhouani S, Lim JH, Chevallet JP, Falquet G (2006) Combining textual and visual ontologies to solve medical multimodal queries. In: 2006 International conference on multimedia and expo
Smeulders AWM, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380
Smith JR, Chang SF (1996) Visualseek: a fully automated content-based image query system. In: ACM multimedia, pp 87–98
Tamura H, Yokoya N (1984) Image database systems: a survey. Pattern Recogn 17(1):29–43
Wang H, Liu S, Chia LT (2006) Does ontology help in image retrieval? In: Asia-Pacific workshop on visual information processing, best paper award
Wang H, Liu S, Chia LT (2006) Does ontology help in image retrieval?—a comparison between keyword, text ontology and multi-modality ontology approaches. In: MULTIMEDIA ’06: Proceedings of the 14th annual ACM international conference on multimedia. ACM Press, New York, NY, USA, pp 109–112. http://doi.acm.org/10.1145/1180639.1180672
Wang X, Ma W, He Q, Li X (2004) Grouping web image search result. In: Proceedings of the 12th annual ACM international conference on multimedia, pp 436–439
Yanai K, Barnard K (2005) Probabilistic web image gathering. In: MIR ’05: Proceedings of the 7th ACM SIGMM international workshop on multimedia information retrieval. ACM Press, New York, NY, USA, pp 57–64. http://doi.acm.org/10.1145/1101826.1101838
Cite this article
Wang, H., Chia, LT. & Liu, S. Image retrieval ++—web image retrieval with an enhanced multi-modality ontology. Multimed Tools Appl 39, 189–215 (2008). https://doi.org/10.1007/s11042-008-0202-7
DOI: https://doi.org/10.1007/s11042-008-0202-7