
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5603)


Abstract

The aim of this paper is to develop (i) a general framework for the analysis of verb-noun (VN) collocations in English and Romanian, and (ii) a system for the extraction of VN-collocations from large tagged and annotated corpora. We identify VN-collocations in two steps: (i) by calculating the frequent lexical co-occurrences of each VN-pair, and (ii) by identifying the most typical lexico-grammatical constructions in which each VN-pair is involved.
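The two steps can be illustrated with a minimal sketch. It is not the authors' system: the input format (sentences as lists of `(lemma, pos)` tuples with coarse tags such as `V`, `N`, `DET`), the function name `extract_vn`, the right-hand search window, and the raw-frequency threshold `min_freq` are all assumptions made for illustration. The abstract does not say which co-occurrence statistic or pattern representation is used.

```python
from collections import Counter, defaultdict

def extract_vn(corpus, window=3, min_freq=2):
    """Toy two-step VN-collocation extraction (hypothetical, for illustration).

    corpus: list of sentences, each a list of (lemma, pos) tuples.
    Returns {(verb, noun): (frequency, most typical POS pattern)}.
    """
    pair_freq = Counter()            # step (i): co-occurrence counts per VN-pair
    patterns = defaultdict(Counter)  # step (ii): construction counts per VN-pair

    for sent in corpus:
        for i, (v_lemma, v_pos) in enumerate(sent):
            if v_pos != "V":
                continue
            # Look at up to `window` tokens to the right of the verb.
            for j in range(i + 1, min(i + 1 + window, len(sent))):
                n_lemma, n_pos = sent[j]
                if n_pos != "N":
                    continue
                pair = (v_lemma, n_lemma)
                pair_freq[pair] += 1
                # Assumed construction label: POS tags from verb to noun.
                tags = ["V"] + [pos for _, pos in sent[i + 1:j]] + ["N"]
                patterns[pair][" ".join(tags)] += 1

    # Keep frequent pairs and report their most frequent construction.
    return {
        pair: (freq, patterns[pair].most_common(1)[0][0])
        for pair, freq in pair_freq.items()
        if freq >= min_freq
    }

# Hand-made toy corpus (not data from the paper).
corpus = [
    [("take", "V"), ("a", "DET"), ("decision", "N")],
    [("take", "V"), ("the", "DET"), ("decision", "N")],
    [("take", "V"), ("decision", "N")],
    [("read", "V"), ("book", "N")],
]
result = extract_vn(corpus)
```

On this toy corpus, `take` + `decision` occurs three times, most often in the pattern `V DET N`, while `read` + `book` falls below the frequency threshold. A realistic system would likely replace the raw count with an association measure and use richer morphosyntactic annotation than coarse POS tags.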




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Todirascu, A., Gledhill, C., Stefanescu, D. (2009). Extracting Collocations in Contexts. In: Vetulani, Z., Uszkoreit, H. (eds) Human Language Technology. Challenges of the Information Society. LTC 2007. Lecture Notes in Computer Science, vol 5603. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04235-5_29


  • DOI: https://doi.org/10.1007/978-3-642-04235-5_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04234-8

  • Online ISBN: 978-3-642-04235-5

  • eBook Packages: Computer Science (R0)
