Corpus-Driven Hyponym Acquisition for Turkish Language

Yıldırım, Savaş; Yıldız, Tuğba

doi:10.1007/978-3-642-28604-9_3

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7181)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

In this study, we propose a method for acquiring hyponymy relations for Turkish. The integrated method relies on both lexico-syntactic patterns and semantic similarity. Once the model has extracted candidate items using the patterns, it applies similarity-based elimination of the incorrect ones to increase precision. We show that an algorithm based on a particular lexico-syntactic pattern for Turkish can retrieve many hyponymy relations, and we demonstrate that elimination based on semantic similarity gives promising results. We also discuss how we measure the similarity between concepts. The objective is to obtain more relevant and more precise results. The experiments show that this approach achieves high precision.
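
The two stages described in the abstract (pattern-based extraction, then similarity-based elimination) can be illustrated with a minimal sketch. The abstract names neither the pattern nor the similarity measure, so everything specific here is an assumption made for illustration: the "X, Y ve Z gibi H" ("H such as X, Y and Z") pattern, cosine similarity over toy context-count vectors, the threshold, and the function names `extract_candidates`, `cosine`, and `filter_by_similarity`.

```python
import math
import re

# Hypothetical pattern (an assumption, not taken from the paper):
# "x, y ve z gibi H", roughly "H such as x, y and z".
PATTERN = re.compile(r"((?:\w+, )*\w+(?: ve \w+)?) gibi (\w+)")


def extract_candidates(sentence):
    """Return (hypernym, [hyponym candidates]) pairs matched by PATTERN."""
    pairs = []
    for match in PATTERN.finditer(sentence):
        hyponyms = re.split(r", | ve ", match.group(1))
        # The hypernym is kept as the surface form; a real system would
        # probably lemmatize it with a morphological analyzer.
        pairs.append((match.group(2), hyponyms))
    return pairs


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_by_similarity(candidates, vectors, threshold):
    """Keep candidates whose mean similarity to the other candidates
    reaches the threshold (an assumed elimination criterion)."""
    kept = []
    for word in candidates:
        others = [c for c in candidates if c != word]
        if not others:
            kept.append(word)
            continue
        total = sum(cosine(vectors.get(word, {}), vectors.get(o, {}))
                    for o in others)
        if total / len(others) >= threshold:
            kept.append(word)
    return kept


# Toy context counts, invented for this example only.
VECTORS = {
    "elma": {"yemek": 3, "tatlı": 2},
    "armut": {"yemek": 2, "tatlı": 3},
    "masa": {"ahşap": 4, "oturmak": 1},
}

if __name__ == "__main__":
    for hypernym, cands in extract_candidates("elma, armut ve masa gibi meyveler"):
        print(hypernym, filter_by_similarity(cands, VECTORS, threshold=0.3))
```

In this toy input, "masa" (table) is matched by the pattern but shares no context with "elma" (apple) or "armut" (pear), so the similarity step drops it. This mirrors the precision-oriented elimination the abstract describes, without claiming to reproduce the authors' actual measure.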

Author information

Authors and Affiliations

Department of Computer Science, Istanbul Bilgi University, Dolapdere, 34440, Istanbul, Turkey
Savaş Yıldırım & Tuğba Yıldız

Editor information

Editors and Affiliations

Center for Computing Research (CIC), National Polytechnic Institute (IPN), Mexico City, Mexico
Alexander Gelbukh

Cite this paper

Yıldırım, S., Yıldız, T. (2012). Corpus-Driven Hyponym Acquisition for Turkish Language. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_3

DOI: https://doi.org/10.1007/978-3-642-28604-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28603-2
Online ISBN: 978-3-642-28604-9
eBook Packages: Computer Science, Computer Science (R0)