Abstract
The aim of this paper is to develop (i) a general framework for the analysis of verb-noun (VN) collocations in English and Romanian, and (ii) a system for the extraction of VN-collocations from large tagged and annotated corpora. We identify VN-collocations in two steps: (i) by calculating the frequent lexical co-occurrences of each VN-pair, and (ii) by identifying the most typical lexico-grammatical constructions in which each VN-pair is involved.
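The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the toy corpus, the coarse tag names, the `max_gap` window, the raw-frequency threshold `min_freq`, and the helper names `vn_pairs` and `extract` are all assumptions made for illustration. A real extractor would likely replace the raw count with a statistical association measure.

```python
from collections import Counter

# Toy tagged corpus: each token is (word form, lemma, coarse POS tag).
# Sentences and tag names are illustrative assumptions, not the paper's data.
SENTENCES = [
    [("He", "he", "PRON"), ("makes", "make", "VERB"),
     ("a", "a", "DET"), ("decision", "decision", "NOUN")],
    [("They", "they", "PRON"), ("made", "make", "VERB"),
     ("the", "the", "DET"), ("decision", "decision", "NOUN")],
    [("She", "she", "PRON"), ("made", "make", "VERB"),
     ("a", "a", "DET"), ("decision", "decision", "NOUN")],
    [("We", "we", "PRON"), ("take", "take", "VERB"),
     ("place", "place", "NOUN")],
]


def vn_pairs(sentence, max_gap=3):
    """Yield (verb lemma, noun lemma, pattern) for each verb and the
    nearest noun that follows it within max_gap tokens."""
    for i, (_, v_lemma, v_tag) in enumerate(sentence):
        if v_tag != "VERB":
            continue
        for j in range(i + 1, min(i + 1 + max_gap, len(sentence))):
            _, n_lemma, n_tag = sentence[j]
            if n_tag == "NOUN":
                # Lemmas between verb and noun describe the construction.
                between = " ".join(lemma for _, lemma, _ in sentence[i + 1:j])
                pattern = f"V {between} N" if between else "V N"
                yield v_lemma, n_lemma, pattern
                break  # keep only the nearest noun


def extract(sentences, min_freq=2):
    """Step 1: count VN-pair co-occurrences.
    Step 2: record the most frequent construction for each frequent pair."""
    pair_counts = Counter()
    pattern_counts = {}
    for sentence in sentences:
        for verb, noun, pattern in vn_pairs(sentence):
            pair_counts[(verb, noun)] += 1
            pattern_counts.setdefault((verb, noun), Counter())[pattern] += 1
    return {
        pair: (freq, pattern_counts[pair].most_common(1)[0][0])
        for pair, freq in pair_counts.items()
        if freq >= min_freq
    }


result = extract(SENTENCES)
print(result)
```

On this toy input, only the pair (make, decision) passes the frequency threshold, and its most typical construction is the one with the indefinite article. Working on lemmas rather than word forms lets "makes" and "made" count toward the same pair.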
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Todirascu, A., Gledhill, C., Stefanescu, D. (2009). Extracting Collocations in Contexts. In: Vetulani, Z., Uszkoreit, H. (eds) Human Language Technology. Challenges of the Information Society. LTC 2007. Lecture Notes in Computer Science, vol 5603. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04235-5_29
DOI: https://doi.org/10.1007/978-3-642-04235-5_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04234-8
Online ISBN: 978-3-642-04235-5
eBook Packages: Computer Science (R0)