Abstract
In this study, we propose a method for acquiring hyponymy relations for Turkish. The integrated method relies on both lexico-syntactic patterns and semantic similarity. Once the model has extracted candidate items using patterns, it applies similarity-based elimination to discard the incorrect ones and thereby increase precision. We show that an algorithm based on a particular lexico-syntactic pattern for Turkish can retrieve many hyponymy relations, and we demonstrate that elimination based on semantic similarity gives promising results. We also discuss how we measure the similarity between concepts. The objective is to obtain more relevant and more precise results. The experiments show that this approach achieves high precision.
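The two stages described above (pattern-based extraction, then similarity-based elimination) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the "X, Y ve Z gibi CLASS" pattern (roughly "CLASS such as X, Y and Z"), the toy context-count vectors, the use of cosine similarity averaged over the other candidates, the threshold value, and the function names. The sketch also works on surface forms only and does no morphological analysis.

```python
import math
import re

# Hypothetical pattern, not necessarily the one used in the paper:
# "elma, armut ve masa gibi meyveler" ~ "fruits such as apple, pear and table".
PATTERN = re.compile(r"((?:\w+, )*\w+(?: ve \w+)?) gibi (\w+)")


def extract_candidates(sentence):
    """Return (hyponym, hypernym) candidate pairs matched by the pattern."""
    pairs = []
    for match in PATTERN.finditer(sentence):
        items = re.split(r", | ve ", match.group(1))
        pairs.extend((item, match.group(2)) for item in items)
    return pairs


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_by_similarity(hyponyms, vectors, threshold):
    """Keep candidates whose mean similarity to the other candidates
    reaches the threshold (one possible elimination rule)."""
    kept = []
    for word in hyponyms:
        others = [w for w in hyponyms if w != word]
        if not others:
            kept.append(word)
            continue
        mean = sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
        if mean >= threshold:
            kept.append(word)
    return kept


# Toy co-occurrence counts, invented for this example only.
TOY_VECTORS = {
    "elma": {"ye": 5, "tatlı": 3},
    "armut": {"ye": 4, "tatlı": 2},
    "masa": {"ahşap": 5, "otur": 2},
}

candidates = extract_candidates("elma, armut ve masa gibi meyveler")
hyponyms = [hyponym for hyponym, _ in candidates]
kept = filter_by_similarity(hyponyms, TOY_VECTORS, threshold=0.3)
```

On this toy input the pattern proposes three candidates for "meyveler", and the filter discards "masa" because its context vector shares nothing with the other two. In practice the vectors would come from corpus co-occurrence statistics and the threshold would need tuning.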
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yıldırım, S., Yıldız, T. (2012). Corpus-Driven Hyponym Acquisition for Turkish Language. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_3
DOI: https://doi.org/10.1007/978-3-642-28604-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28603-2
Online ISBN: 978-3-642-28604-9
eBook Packages: Computer Science, Computer Science (R0)