DOI: 10.1145/2024288.2024330
short-paper

Automatic acquisition of taxonomies in different languages from multiple Wikipedia versions

Published: 07 September 2011

Abstract

In recent years, the vision of the Semantic Web has led to many approaches that aim to automatically derive knowledge bases from Wikipedia. These approaches rely mostly on the English Wikipedia, as it is the largest Wikipedia version, and have led to valuable knowledge bases. However, each Wikipedia version contains socio-cultural knowledge, i.e., knowledge with specific relevance for a culture or language. One difficulty in applying existing approaches to multiple Wikipedia versions is their reliance on additional corpora. In this paper, we describe an adaptation of existing heuristics that makes it possible to extract large sets of hyponymy relations from multiple Wikipedia versions with little information about each language. Further, we evaluate our approach on Wikipedia versions in four different languages and compare the results with GermaNet for German and WordNet for English.


Cited By

  • (2017) Automatic Acquisition of Controlled Vocabularies from Wikipedia Using Wikilinks, Word Ranking, and a Dependency Parser. Advances in Computing, pages 32-43. DOI: 10.1007/978-3-319-66562-7_3. Online publication date: 17-Aug-2017
  • (2012) Automatic taxonomy extraction in different languages using Wikipedia and minimal language-specific information. Proceedings of the 13th International Conference on Computational Linguistics and Intelligent Text Processing - Volume Part I, pages 42-53. DOI: 10.1007/978-3-642-28604-9_4. Online publication date: 11-Mar-2012


      Published In

      i-KNOW '11: Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies
      September 2011
      306 pages
      ISBN:9781450307321
      DOI:10.1145/2024288

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. Wikipedia mining
      2. hyponymy detection
      3. knowledge derivation
      4. taxonomy acquisition

      Qualifiers

      • Short-paper

      Conference

      i-KNOW '11

      Acceptance Rates

      Overall Acceptance Rate 77 of 238 submissions, 32%
