Abstract
In this article, we present the topic of semantic annotation of text using open semantic resources. We introduce the concept and its history and provide the basic notions around semantic annotations and open semantic resources, illustrating in particular commonly used open semantic resource repositories such as Wikipedia, WordNet, and DBpedia. Further, we discuss the issues around creating open semantic resources, from both the annotation perspective and the format perspective. Finally, we introduce two well-known semantic annotation tasks, entity linking (or named entity disambiguation) and semantic parsing, with corresponding sample implementations, explaining in particular how they work and how they make use of open semantic resources.
Notes
- Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. Robust Disambiguation of Named Entities in Text. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Edinburgh, Scotland, pages 782–792, 2011.
- David Milne and Ian H. Witten. Learning to link with Wikipedia. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, pages 509–518. ACM, 2008.
- Roberto Navigli. Word sense disambiguation: A survey. ACM Comput. Surv., 41(2):10:1–10:69, February 2009.
- Luke S. Zettlemoyer and Michael Collins. Learning context-dependent mappings from sentences to logical form. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, pages 976–984. Association for Computational Linguistics, 2009.
Recommended Reading
Baader F, Calvanese D, McGuinness DL, Nardi D, Patel-Schneider PF (2010) The description logic handbook: theory, implementation, and applications. Cambridge University Press, Cambridge
Baker CF, Fillmore CJ, Lowe JB (1998) The Berkeley FrameNet project. In: Proceedings of the 17th international conference on computational linguistics. Association for computational linguistics, Montreal, vol 1, pp 86–90
Banerjee S, Pedersen T (2002) An adapted Lesk algorithm for word sense disambiguation using WordNet. In: Proceedings of computational linguistics and intelligent text processing, third international conference, CICLing 2002, Mexico City, pp 136–145
Berant J, Chou A, Frostig R, Liang P (2013) Semantic parsing on Freebase from question-answer pairs. In: EMNLP, Seattle, pp 1533–1544
Berners-Lee T, Hendler J, Lassila O (2001) The semantic web. Sci Am 284:34–43
Bollacker K, Evans C, Paritosh P, Sturge T, Taylor J (2008) Freebase: a collaboratively created graph database for structuring human knowledge. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data, SIGMOD’08. ACM, New York, pp 1247–1250
Bordes A, Glorot X, Weston J, Bengio Y (2012) Joint learning of words and meaning representations for open-text semantic parsing. In: International conference on artificial intelligence and statistics, La Palma, pp 127–135
Bosma W, Vossen P, Soroa A, Rigau G, Tesconi M, Marchetti A, Monachini M, Aliprandi C (2009) KAF: a generic semantic annotation format. In: Proceedings of the GL2009 workshop on semantic annotation, Pisa
Brachman RJ, Levesque HJ (2004) Knowledge representation and reasoning. The Morgan Kaufmann series in artificial intelligence series. Morgan Kaufmann, Burlington
Bradesko L, Starc J, Pacifico S (2015) Isaac Bloomberg meets Michael Bloomberg: better entity disambiguation for the news. In: Proceedings of the 24th international conference on World Wide Web companion WWW, companion volume, Florence, pp 631–635
Cai Q, Yates A (2013) Large-scale semantic parsing via schema matching and lexicon extension. In: ACL (1), Sofia. Citeseer, pp 423–433
Cucerzan S (2007) Large-scale named entity disambiguation based on Wikipedia data. In: Proceedings of the 2007 joint conference on EMNLP and CoNLL, Prague, pp 708–716
Erxleben F, Günther M, Krötzsch M, Mendez J, Vrandečić D (2014) Introducing Wikidata to the linked data web. In: The semantic web – ISWC 2014 – Proceedings of 13th international semantic web conference part I, Riva del Garda, pp 50–65
Fellbaum C (2005) WordNet and wordnets. In: Brown K (ed) Encyclopedia of language and linguistics. Elsevier, Oxford, pp 665–670
Gildea D, Jurafsky D (2002) Automatic labeling of semantic roles. Comput Linguist 28:245–288
Hajič J, Ciaramita M, Johansson R, Kawahara D, Martí M, Màrquez L, Meyers A, Nivre J, Padó S, Štěpánek J et al (2009) The CoNLL-2009 shared task: syntactic and semantic dependencies in multiple languages. In: Proceedings of the thirteenth conference on computational natural language learning: shared task. Association for computational linguistics, Stroudsburg, pp 1–18
Hoffart J, Yosef MA, Bordino I, Fürstenau H, Pinkal M, Spaniol M, Taneva B, Thater S, Weikum G (2011) Robust disambiguation of named entities in text. In: Proceedings of the 2011 conference on empirical methods in natural language processing, EMNLP 2011, Edinburgh, pp 782–792
Lehmann J, Isele R, Jakob M, Jentzsch A, Kontokostas D, Mendes PN, Hellmann S, Morsey M, van Kleef P, Auer S, Bizer C (2015) DBpedia – a large-scale, multilingual knowledge base extracted from Wikipedia. Semant Web J 6(2):167–195
Lesk M (1986) Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: ACM special interest group for design of communication, Chicago, pp 24–26
Mahdisoltani F, Biega J, Suchanek F (2015) YAGO3: a knowledge base from multilingual Wikipedias. In: Proceedings of the 7th biennial conference on innovative data systems research (CIDR 2015), Asilomar
Mendes PN, Jakob M, Garcia-Silva A, Bizer C (2011) DBpedia Spotlight: shedding light on the web of documents. In: Proceedings of the 7th international conference on semantic systems (I-semantics), Graz
Mihalcea R, Csomai A (2007) Wikify!: linking documents to encyclopedic knowledge. In: Proceedings of the sixteenth ACM conference on information and knowledge management, CIKM’07. ACM, New York, pp 233–242
Milne D, Witten IH (2008) Learning to link with Wikipedia. In: Proceedings of the 17th ACM conference on information and knowledge management. ACM, Napa Valley, pp 509–518
Navigli R (2009) Word sense disambiguation: a survey. ACM Comput Surv 41(2):10:1–10:69
Nguyen DB, Hoffart J, Theobald M, Weikum G (2014) AIDA-light: high-throughput named-entity disambiguation. In: Linked data on the web at WWW2014, Seoul
W3C OWL Working Group (2009) OWL 2 web ontology language: document overview. W3C Recommendation, 27 October 2009. Available at http://www.w3.org/TR/owl2-overview/
Pershina M, He Y, Grishman R (2015) Personalized page rank for named entity disambiguation. In: Proceedings of the 2015 conference of the North American chapter of the association for computational linguistics: human language technologies. Association for computational linguistics, Denver, pp 238–243
Poon H, Domingos P (2009) Unsupervised semantic parsing. In: Proceedings of the 2009 conference on empirical methods in natural language processing, vol 1. Association for computational linguistics, Singapore, pp 1–10
Roberts A, Gaizauskas R, Hepple M, Davis N, Demetriou G, Guo Y, Kola JS, Roberts I, Setzer A, Tapuria A et al (2007) The CLEF corpus: semantic annotation of clinical text. In: AMIA annual symposium proceedings. American Medical Informatics Association, Chicago, vol 2007, p 625
Surdeanu M, Johansson R, Meyers A, Màrquez L, Nivre J (2008) The CoNLL-2008 shared task on joint parsing of syntactic and semantic dependencies. In: Proceedings of the twelfth conference on computational natural language learning. Association for computational linguistics, Manchester, pp 159–177
Uren V, Cimiano P, Iria J, Handschuh S, Vargas-Vera M, Motta E, Ciravegna F (2006) Semantic annotation for knowledge management: requirements and a survey of the state of the art. Web Semant Sci Serv Agents World Wide Web 4(1):14–28
Völkel M, Krötzsch M, Vrandečić D, Haller H, Studer R (2006) Semantic Wikipedia. In: Proceedings of the 15th international conference on World Wide Web, WWW 2006, Edinburgh
Warren DHD (1981) Efficient processing of interactive relational data base queries expressed in logic. In: Proceedings of the seventh international conference on very large data bases, vol 7. VLDB Endowment, Cannes, pp 272–281
Wong YW, Mooney RJ (2006) Learning for semantic parsing with statistical machine translation. In: Proceedings of the main conference on human language technology conference of the North American chapter of the association of computational linguistics. Association for computational linguistics, New York, pp 439–446
Zelle JM, Mooney RJ (1996) Learning to parse database queries using inductive logic programming. In: Proceedings of the national conference on artificial intelligence, Portland, pp 1050–1055
Zettlemoyer LS, Collins M (2009) Learning context-dependent mappings from sentences to logical form. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP, vol 2. Association for Computational Linguistics, Stroudsburg, pp 976–984
Copyright information
© 2017 Springer Science+Business Media New York
About this entry
Cite this entry
Pacifico, S., Starc, J., Brank, J., Bradesko, L., Grobelnik, M. (2017). Semantic Annotation of Text Using Open Semantic Resources. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_903
DOI: https://doi.org/10.1007/978-1-4899-7687-1_903
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1
eBook Packages: Computer Science, Reference Module Computer Science and Engineering