Corpus-Driven Hyponym Acquisition for Turkish Language

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7181)

Abstract

In this study, we propose a method for acquiring hyponymy relations for the Turkish language. The integrated method relies on both lexico-syntactic patterns and semantic similarity. Once the model has extracted candidate items using the patterns, it applies similarity-based elimination of the incorrect ones to increase precision. We show that an algorithm based on a particular lexico-syntactic pattern for Turkish can retrieve many hyponymy relations, and we demonstrate that elimination based on semantic similarity gives promising results. We also discuss how we measure the similarity between concepts. The objective is to obtain more relevant and more precise results. The experiments show that this approach achieves high precision.
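
The two stages described in the abstract, pattern-based extraction followed by similarity-based elimination, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not name the pattern or the similarity measure, so the sketch assumes a simplified "X, Y ve Z gibi H" ("H such as X, Y and Z") pattern, cosine similarity over toy co-occurrence counts, and a hypothetical threshold. The names PATTERN, extract_candidates, cosine and filter_candidates are illustrative only, and no morphological analysis is performed.

```python
import math
import re

# Assumed, simplified pattern: "<item>, <item> ve <item> gibi <hypernym>".
# The abstract does not specify the real pattern used in the paper.
PATTERN = re.compile(r"((?:\w+, )*\w+ ve \w+) gibi (\w+)")


def extract_candidates(sentence):
    """Return (hypernym, [candidate hyponyms]) pairs matched in a sentence."""
    pairs = []
    for match in PATTERN.finditer(sentence):
        items = re.split(r", | ve ", match.group(1))
        pairs.append((match.group(2), items))
    return pairs


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_candidates(candidates, vectors, threshold):
    """Keep candidates whose mean similarity to the other candidates
    reaches the threshold (a hypothetical elimination rule)."""
    kept = []
    for word in candidates:
        others = [c for c in candidates if c != word]
        if not others:
            kept.append(word)  # nothing to compare against
            continue
        total = sum(
            cosine(vectors.get(word, {}), vectors.get(other, {}))
            for other in others
        )
        if total / len(others) >= threshold:
            kept.append(word)
    return kept
```

For a sentence such as "elma, armut ve masa gibi meyveler", extract_candidates returns the surface form "meyveler" with the candidates "elma", "armut" and "masa". If the toy vectors for "elma" and "armut" share context words while "masa" shares none, filter_candidates drops "masa", which is the kind of precision gain the abstract attributes to the elimination step.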

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yıldırım, S., Yıldız, T. (2012). Corpus-Driven Hyponym Acquisition for Turkish Language. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_3

  • DOI: https://doi.org/10.1007/978-3-642-28604-9_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28603-2

  • Online ISBN: 978-3-642-28604-9

  • eBook Packages: Computer Science, Computer Science (R0)
