Abstract
In this study, we designed a model to detect synonymy. Our main assumption is that synonym pairs, by definition, show similar semantic and dependency relations: they share the same meronym/holonym and hypernym/hyponym relations. Unlike synonymy, hypernymy and meronymy relations can plausibly be acquired by applying lexico-syntactic patterns to a large corpus. Such acquisition can be used to ease the detection of synonymy. Likewise, we used particular dependency relations, such as being the object or subject of a verb. Machine learning algorithms were applied to all of these acquired features. The first aim is to find out which dependency and semantic features are the most informative and contribute most to the model. The performance of each feature is evaluated individually with cross-validation. The model that combines all features shows promising results and successfully detects the synonymy relation. The main contribution of the study is the integration of both semantic and dependency relations within a distributional approach. The second contribution is that the study can be considered the first major attempt at Turkish synonym identification based on a corpus-driven approach.
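The feature idea described in the abstract can be illustrated with a minimal sketch. It assumes that each word has a profile of related words per relation type (hypernyms, meronyms, verbs it is the object of) and that a candidate pair is described by one similarity score per relation. The cosine measure, the function names, and the toy Turkish counts below are assumptions made for illustration; the abstract does not specify the similarity measure, the feature encoding, or the classifier.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_features(w1: str, w2: str, profiles: dict) -> dict:
    """Return one similarity score per relation type for a word pair.

    `profiles` maps a relation name to a table of word -> Counter of
    related words (hypothetical structure, not taken from the paper).
    """
    return {
        relation: cosine(table.get(w1, Counter()), table.get(w2, Counter()))
        for relation, table in profiles.items()
    }


# Invented toy counts, standing in for relations extracted from a corpus.
PROFILES = {
    "hypernym": {
        "araba": Counter({"makine": 3}),
        "otomobil": Counter({"makine": 2}),
        "elma": Counter({"meyve": 4}),
    },
    "meronym": {
        "araba": Counter({"tekerlek": 2, "motor": 1}),
        "otomobil": Counter({"tekerlek": 1, "motor": 1}),
        "elma": Counter({"kabuk": 2}),
    },
    "object_of": {
        "araba": Counter({"kullan": 5, "sat": 1}),
        "otomobil": Counter({"kullan": 2, "sat": 2}),
        "elma": Counter({"ye": 6}),
    },
}

if __name__ == "__main__":
    for pair in [("araba", "otomobil"), ("araba", "elma")]:
        print(pair, pair_features(*pair, PROFILES))
```

In a setup like this, each candidate pair becomes a small feature vector that could be passed to a supervised classifier and evaluated with cross-validation, with single-feature runs used to compare how informative each relation is.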
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Yıldız, T., Yıldırım, S., Diri, B. (2014). An Integrated Approach to Automatic Synonym Detection in Turkish Corpus. In: Przepiórkowski, A., Ogrodniczuk, M. (eds) Advances in Natural Language Processing. NLP 2014. Lecture Notes in Computer Science, vol 8686. Springer, Cham. https://doi.org/10.1007/978-3-319-10888-9_12
DOI: https://doi.org/10.1007/978-3-319-10888-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10887-2
Online ISBN: 978-3-319-10888-9
eBook Packages: Computer Science (R0)