Abstract
The paper presents a method for automatically enriching a very large dictionary of word combinations. The method is based on the results of automatic syntactic analysis (parsing) of sentences. Syntactic trees are represented in the dependency formalism, which makes information about syntactic compatibility easier to handle. The method is evaluated for Spanish by comparing the automatically generated results with manually marked word combinations.
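The general idea of reading word combinations off a dependency tree can be illustrated with a minimal sketch. This is not the authors' implementation: the `Token` structure, the relation labels, the example phrase, and the use of precision and recall as the comparison measure are all assumptions made for illustration only.

```python
from typing import NamedTuple


class Token(NamedTuple):
    """One node of a dependency tree (hypothetical structure)."""
    lemma: str
    head: int       # index of the governing token, -1 for the root
    relation: str   # hypothetical dependency relation label


def extract_combinations(sentence):
    """Collect (head lemma, relation, dependent lemma) triples from one tree."""
    combos = set()
    for tok in sentence:
        if tok.head >= 0:  # the root has no governor, so it yields no pair
            combos.add((sentence[tok.head].lemma, tok.relation, tok.lemma))
    return combos


def precision_recall(found, gold):
    """Compare extracted combinations with a manually marked set."""
    correct = len(found & gold)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hand-built tree for the Spanish phrase "leer un libro interesante"
# (an assumed example, not taken from the paper).
sentence = [
    Token("leer", -1, "root"),
    Token("un", 2, "det"),
    Token("libro", 0, "obj"),
    Token("interesante", 2, "mod"),
]

found = extract_combinations(sentence)

# Manually marked combinations for the same phrase (assumed for illustration).
gold = {("leer", "obj", "libro"), ("libro", "mod", "interesante")}

p, r = precision_recall(found, gold)
print(sorted(found))
print(p, r)
```

In this sketch every head-dependent link becomes a candidate, including the determiner link, which the manual marking does not count as a word combination. A real system would presumably filter candidates before adding them to the dictionary.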
This work was done under partial support of the Mexican Government (CONACyT, SNI, CGPI-IPN, PIFI-IPN), the Korean Government (KIPA Professorship for Visiting Faculty Positions in Korea), and ITRI of CAU. The first author is currently on sabbatical leave at Chung-Ang University. We thank Prof. Igor A. Bolshakov for useful discussions.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gelbukh, A., Sidorov, G., Han, SY., Hernández-Rubio, E. (2004). Automatic Enrichment of Very Large Dictionary of Word Combinations on the Basis of Dependency Formalism. In: Monroy, R., Arroyo-Figueroa, G., Sucar, L.E., Sossa, H. (eds) MICAI 2004: Advances in Artificial Intelligence. MICAI 2004. Lecture Notes in Computer Science, vol 2972. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24694-7_44
DOI: https://doi.org/10.1007/978-3-540-24694-7_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21459-5
Online ISBN: 978-3-540-24694-7
eBook Packages: Springer Book Archive