Automatic Acquisition of GL Resources, Using an Explanatory, Symbolic Technique

Claveau, Vincent; Sébillot, Pascale

doi:10.1007/978-94-007-5189-7_19

Part of the book series: Text, Speech and Language Technology (TLTB, volume 46)

Abstract

This chapter presents a symbolic machine learning method for acquiring qualia relations from corpora. The method takes as input descriptions of noun-verb (N-V) pairs found in a corpus, in which the verb either does or does not play one of the qualia roles of the noun, and automatically infers corpus-specific morpho-syntactic and semantic patterns that convey qualia relations. The patterns are explanatory and linguistically motivated, and they can be applied to a corpus to efficiently extract GL resources and populate Generative Lexicons. The linguistic relevance of these patterns is examined, and the N-V qualia pairs that they can and cannot detect are discussed. Comparisons with other methods for corpus-based extraction of qualia pairs are also presented.
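
The general idea of inferring patterns from N-V pairs labelled as qualia or non-qualia can be illustrated with a toy sketch. This is not the chapter's algorithm: it is a simplified greedy covering learner, used here as a stand-in for a symbolic learner, and the feature names (`prep_between`, `aux_between`, and so on) and the example data are invented for illustration. Each example is a set of context features, and a pattern is a small conjunction of features that matches some positive examples and no negative ones.

```python
from itertools import combinations


def covers(pattern, example):
    """A pattern (set of features) covers an example that contains all of them."""
    return pattern <= example


def learn_patterns(positives, negatives, max_size=2):
    """Greedy covering: repeatedly pick the pattern that matches the most
    still-uncovered positive examples while matching no negative example."""
    remaining = list(positives)
    learned = []
    while remaining:
        # Candidate patterns are small feature subsets of uncovered positives.
        candidates = set()
        for example in remaining:
            for size in range(1, max_size + 1):
                for combo in combinations(sorted(example), size):
                    candidates.add(frozenset(combo))
        # Keep only patterns that reject every negative example.
        consistent = [
            p for p in candidates
            if not any(covers(p, n) for n in negatives)
        ]
        if not consistent:
            break  # no consistent pattern exists for the remaining positives
        # Prefer higher coverage, then shorter patterns, then a stable order.
        best = max(
            consistent,
            key=lambda p: (sum(covers(p, e) for e in remaining), -len(p), sorted(p)),
        )
        learned.append(best)
        remaining = [e for e in remaining if not covers(best, e)]
    return learned


# Hypothetical descriptions of N-V pair occurrences (invented features).
positives = [
    frozenset({"verb_infinitive", "prep_between", "noun_before_verb"}),
    frozenset({"verb_infinitive", "prep_between", "noun_before_verb", "det_before_noun"}),
    frozenset({"verb_past_participle", "noun_before_verb", "aux_between"}),
]
negatives = [
    frozenset({"verb_infinitive", "noun_before_verb", "punct_between"}),
    frozenset({"verb_past_participle", "punct_between"}),
]

patterns = learn_patterns(positives, negatives)
for pattern in patterns:
    print(sorted(pattern))
```

Because each learned pattern is a readable conjunction of features, it can be inspected directly, which is the sense in which such patterns are explanatory rather than opaque scores.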

Author information

Authors and Affiliations

IRISA – CNRS, Rennes, France
Vincent Claveau & Pascale Sébillot

Corresponding author

Correspondence to Vincent Claveau.

Editor information

Editors and Affiliations

Department of Computer Science, Volen Center for Complex Systems, Brandeis University, South Street 415, Waltham, 02454, Massachusetts, USA
James Pustejovsky
Faculté de Traduction, Université de Genève, Geneva, 1211, Switzerland
Pierrette Bouillon
Information and Media Center, Toyohashi University of Technology, 1-1 Hibarigaoka, Tenpakucho, Toyohashi, 441-8580, Japan
Hitoshi Isahara
Department of Linguistic Theory and Structure, National Institute for Japanese Language and Linguistics, 10-2 Midoricho, Tachikawa, 190-8561, Japan
Kyoko Kanzaki
Department of Linguistics, Seoul National University, Seoul, 151-742, Republic of Korea
Chungmin Lee

About this chapter

Cite this chapter

Claveau, V., Sébillot, P. (2013). Automatic Acquisition of GL Resources, Using an Explanatory, Symbolic Technique. In: Pustejovsky, J., Bouillon, P., Isahara, H., Kanzaki, K., Lee, C. (eds) Advances in Generative Lexicon Theory. Text, Speech and Language Technology, vol 46. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-5189-7_19

DOI: https://doi.org/10.1007/978-94-007-5189-7_19
Published: 30 October 2012
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-5188-0
Online ISBN: 978-94-007-5189-7
eBook Packages: Computer Science, Computer Science (R0)