Abstract
This chapter presents a symbolic machine learning method that automatically infers corpus-specific morpho-syntactic and semantic patterns conveying qualia relations. The method learns from descriptions of noun-verb (N-V) pairs found in a corpus, in which the verb either does or does not play one of the qualia roles of the noun. The resulting patterns are explanatory and linguistically motivated, and can be applied to a corpus to efficiently extract Generative Lexicon (GL) resources and populate Generative Lexicons. The linguistic relevance of these patterns is examined, and the N-V qualia pairs that they can and cannot detect are discussed. Comparisons with other methods for corpus-based extraction of qualia pairs are also presented.
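To make the idea of applying a morpho-syntactic pattern to a tagged corpus more concrete, here is a minimal sketch in Python. It is not the chapter's method and the pattern is not one learned by it: the tag names (`VERB_INF`, `NOUN`, and so on), the function `candidate_pairs`, the `max_gap` parameter, and the pattern itself (an infinitive verb shortly before a noun, with no other verb in between) are all assumptions invented for illustration.

```python
from typing import List, Tuple

# Each token is (lemma, tag). The tag set is hypothetical, made up for this sketch.
Token = Tuple[str, str]


def candidate_pairs(sentence: List[Token], max_gap: int = 3) -> List[Tuple[str, str]]:
    """Return (noun, verb) pairs matching one hand-written toy pattern:
    an infinitive verb precedes the noun, at most `max_gap` tokens lie
    between them, and no other verb intervenes."""
    pairs = []
    for v_idx, (v_lemma, v_tag) in enumerate(sentence):
        if v_tag != "VERB_INF":
            continue
        # Scan the tokens that follow the verb, up to max_gap intervening tokens.
        last = min(v_idx + 2 + max_gap, len(sentence))
        for n_idx in range(v_idx + 1, last):
            n_lemma, n_tag = sentence[n_idx]
            if n_tag.startswith("VERB"):
                break  # another verb intervenes: stop scanning for this verb
            if n_tag == "NOUN":
                pairs.append((n_lemma, v_lemma))
    return pairs


# Toy tagged sentence: "to tighten the screw and check the seal"
sentence = [
    ("to", "PART"), ("tighten", "VERB_INF"), ("the", "DET"), ("screw", "NOUN"),
    ("and", "CONJ"), ("check", "VERB_INF"), ("the", "DET"), ("seal", "NOUN"),
]
pairs = candidate_pairs(sentence)
```

In the approach the abstract describes, such patterns would be inferred from annotated positive and negative N-V examples rather than written by hand; the sketch only shows what applying one pattern to tagged text might look like.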
Copyright information
© 2013 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Claveau, V., Sébillot, P. (2013). Automatic Acquisition of GL Resources, Using an Explanatory, Symbolic Technique. In: Pustejovsky, J., Bouillon, P., Isahara, H., Kanzaki, K., Lee, C. (eds) Advances in Generative Lexicon Theory. Text, Speech and Language Technology, vol 46. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-5189-7_19
DOI: https://doi.org/10.1007/978-94-007-5189-7_19
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-5188-0
Online ISBN: 978-94-007-5189-7
eBook Packages: Computer Science (R0)