Annotating korean text documents with linked data resources

Multimedia Tools and Applications

Abstract

Semantic annotation approaches link mentions of entities in text to the corresponding entities in a knowledge base, providing additional content-related information. Resources from the Linked Open Data (LOD) Cloud are increasingly used to annotate text documents because they form a network of machine-understandable, interlinked data. While existing approaches to semantic annotation in the LOD context have been shown to perform well for English, many other languages, and Korean in particular, remain underrepresented. We investigate the applicability of existing semantic annotation approaches to Korean by adapting two popular approaches from the semantic annotation field and evaluating them on an English-Korean bilingual sense-tagged corpus. We also summarize general challenges in the internationalization of annotation approaches.
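To make the idea of linking mentions to knowledge base entities concrete, the following is a minimal sketch of dictionary-based mention spotting. It is not either of the approaches adapted in the article. The surface-form dictionary, the function name `annotate`, and the URIs under the Korean DBpedia namespace are all hypothetical and chosen for illustration only. Matching is done at the character level rather than on whitespace tokens, because Korean particles attach directly to nouns.

```python
from typing import Dict, List, Tuple

# Hypothetical surface-form dictionary (assumption, not from the article).
# The URIs only illustrate the shape of Korean DBpedia resource identifiers.
SURFACE_FORMS: Dict[str, str] = {
    "서울": "http://ko.dbpedia.org/resource/서울",
    "서울대학교": "http://ko.dbpedia.org/resource/서울대학교",
    "한국": "http://ko.dbpedia.org/resource/대한민국",
}


def annotate(text: str, surface_forms: Dict[str, str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, uri) triples using greedy longest-match spotting."""
    # Try longer surface forms first so "서울대학교" wins over its prefix "서울".
    forms = sorted(surface_forms, key=len, reverse=True)
    annotations: List[Tuple[int, int, str]] = []
    i = 0
    while i < len(text):
        for form in forms:
            if text.startswith(form, i):
                annotations.append((i, i + len(form), surface_forms[form]))
                i += len(form)  # skip past the matched mention
                break
        else:
            i += 1  # no surface form starts here; advance one character
    return annotations


if __name__ == "__main__":
    for start, end, uri in annotate("서울대학교는 한국에 있다", SURFACE_FORMS):
        print(start, end, uri)
```

This sketch performs no disambiguation: each surface form maps to exactly one resource. A realistic annotator would also have to choose among several candidate entities for an ambiguous mention.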

Notes

  1. http://lod2.eu

  2. http://thedatahub.org/group/lodcloud

  3. http://dbpedia.org

  4. Example sentence taken from washingtonpost.com

  5. http://dbpedia.org/spotlight

  6. Source: www.etoday.co.kr (March 22, 2012)

  7. http://ko.dbpedia.org

  8. http://ko.dbpedia.org

  9. http://lucene.apache.org

  10. http://sourceforge.net/projects/lucenekorean

  11. http://alias-i.com/lingpipe/docs/api/com/aliasi/dict/ExactDictionaryChunker.html

  12. http://dumps.wikimedia.org/kowiki

  13. http://opennlp.apache.org

  14. http://corpora.uni-leipzig.de

  15. http://www.cs.waikato.ac.nz/ml/weka

  16. http://www.ted.com

  17. http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Linking

Acknowledgements

This research was conducted under the International Collaborative Research and Development Program (Creating Knowledge out of Interlinked Data) and funded by the Korean Ministry of Knowledge Economy.

Author information

Corresponding author

Correspondence to Mun Yong Yi.

About this article

Cite this article

Müller, D., Yi, M.Y. Annotating korean text documents with linked data resources. Multimed Tools Appl 68, 413–427 (2014). https://doi.org/10.1007/s11042-012-1339-y

  • DOI: https://doi.org/10.1007/s11042-012-1339-y
