Multimodal concept fusion using semantic closeness for image concept disambiguation

Abu-Shareha, Ahmad Adel; Mandava, Rajeswari; Khan, Latifur; Ramachandram, Dhanesh

doi:10.1007/s11042-010-0707-8

Published: 11 January 2011

Volume 61, pages 69–86, (2012)

Multimedia Tools and Applications

Ahmad Adel Abu-Shareha¹,
Rajeswari Mandava¹,
Latifur Khan² &
Dhanesh Ramachandram¹

Abstract

In this paper we show how to resolve the ambiguity of concepts extracted from a visual stream with the help of concepts identified in the associated textual stream. The disambiguation is performed at the concept level, based on semantic closeness over a domain ontology. Semantic closeness is a function of the distance in the ontology between the concept to be disambiguated and selected associated concepts. In this process, an image concept can be disambiguated using any associated concept from the image, the text, or both. The ability of text concepts to resolve ambiguity in image concepts varies. The best case occurs when the same concept is stated clearly in both image and text; the worst case occurs when the image concept is isolated, with no semantically close text concept. The domain ontology used in the disambiguation process is constructed from WordNet and the image labels with selected senses. The results show improved accuracy, demonstrating the effectiveness of the proposed disambiguation process.
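
The idea of distance-based semantic closeness can be illustrated with a minimal sketch. The abstract does not give the exact closeness function, so everything below is an assumption made for illustration: the toy ontology, the concept names, the inverse-distance form 1 / (1 + d), and the helper names `build_graph`, `distance`, `closeness`, and `disambiguate` are all hypothetical and are not the authors' implementation.

```python
from collections import deque

# Hypothetical toy ontology (not from the paper): undirected links between concepts.
EDGES = [
    ("entity", "animal"),
    ("entity", "artifact"),
    ("entity", "landform"),
    ("animal", "crane_bird"),
    ("animal", "heron"),
    ("artifact", "machine"),
    ("machine", "crane_machine"),
    ("machine", "truck"),
    ("landform", "lake"),
]


def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distance(graph, start, goal):
    """Return the number of edges on the shortest path, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def closeness(graph, a, b):
    """Assumed closeness: 1 / (1 + distance); 0.0 when the concepts are unconnected."""
    dist = distance(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def disambiguate(graph, candidates, context):
    """Score each candidate sense by its summed closeness to the context concepts."""
    scores = {
        cand: sum(closeness(graph, cand, ctx) for ctx in context)
        for cand in candidates
    }
    best = max(candidates, key=lambda cand: scores[cand])
    return best, scores


if __name__ == "__main__":
    graph = build_graph(EDGES)
    senses = ["crane_bird", "crane_machine"]
    # Text concepts accompanying the image act as the disambiguating context.
    print(disambiguate(graph, senses, ["heron", "lake"]))
    print(disambiguate(graph, senses, ["truck"]))
```

In this sketch, a text concept identical to a candidate sense contributes the maximum closeness of 1.0, which mirrors the best case described in the abstract. An image concept with no associated context receives a score of 0.0 for every sense, so the choice falls back to the first candidate, which mirrors the isolated-concept worst case.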

Acknowledgments

This work was supported by a Research University grant titled ‘Multimodal Meaning Normalization through Ontologies’ (No:1001/PKOMP/811021).

Author information

Authors and Affiliations

School of Computer Science, Universiti Sains Malaysia, Penang, Malaysia
Ahmad Adel Abu-Shareha, Rajeswari Mandava & Dhanesh Ramachandram
Department of Computer Science, University of Texas at Dallas, Richardson, TX, 75083-0688, USA
Latifur Khan

Corresponding author

Correspondence to Rajeswari Mandava.

About this article

Cite this article

Abu-Shareha, A.A., Mandava, R., Khan, L. et al. Multimodal concept fusion using semantic closeness for image concept disambiguation. Multimed Tools Appl 61, 69–86 (2012). https://doi.org/10.1007/s11042-010-0707-8

Issue Date: November 2012
