Abstract
Although geometric models of meaning are widely used in computational linguistics and information retrieval research, until recently they have mostly been used to model lexical meaning. The ability to combine concepts, however, is an essential capacity of human language, and any semantic theory should be able to handle it.
Making use of Word Space Models (Schütze 1998) and Random Indexing (Sahlgren 2005), we explore the hypothesis that compositional meaning can be captured in such models by adopting a number of mathematical operations for vector composition (summation, component product, tensor product and convolution) to model semantic composition in a multiword unit identification task.
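The four composition operations named above can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: the function names are hypothetical, "convolution" is assumed here to mean circular convolution, and the toy vectors stand in for word vectors that would in practice come from a word space model.

```python
from typing import List

Vector = List[float]


def _check_same_length(u: Vector, v: Vector) -> None:
    # All operations below assume both word vectors live in the same space.
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")


def vector_sum(u: Vector, v: Vector) -> Vector:
    """Summation: add the vectors component by component."""
    _check_same_length(u, v)
    return [a + b for a, b in zip(u, v)]


def component_product(u: Vector, v: Vector) -> Vector:
    """Component product: multiply the vectors component by component."""
    _check_same_length(u, v)
    return [a * b for a, b in zip(u, v)]


def tensor_product(u: Vector, v: Vector) -> List[Vector]:
    """Tensor (outer) product: an n x n matrix with entries u[i] * v[j]."""
    _check_same_length(u, v)
    return [[a * b for b in v] for a in u]


def circular_convolution(u: Vector, v: Vector) -> Vector:
    """Circular convolution (assumed variant): result[i] = sum_j u[j] * v[(i - j) mod n]."""
    _check_same_length(u, v)
    n = len(u)
    return [sum(u[j] * v[(i - j) % n] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    # Toy three-dimensional "word vectors", purely illustrative.
    w1 = [1, 2, 3]
    w2 = [4, 5, 6]
    print(vector_sum(w1, w2))
    print(component_product(w1, w2))
    print(tensor_product(w1, w2))
    print(circular_convolution(w1, w2))
```

Summation, component product and circular convolution keep the dimensionality of the input vectors, whereas the tensor product squares it, which matters when the composed representation has to be compared with vectors from the original space.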
This work was supported by the German Federal Ministry of Economics (BMWi) under the project Theseus (number 01MQ07019).
References
Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval. ACM Press, Addison-Wesley (1999)
Baldwin, T., Bannard, C., Tanaka, T., Widdows, D.: An empirical model of multiword expression decomposability. In: Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment (2003)
Bannard, C., Baldwin, T., Lascarides, A.: A statistical approach to the semantics of verb-particles. In: Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan (2003)
Berry, M., Drmac, Z., Jessup, E.R.: Matrices, Vector Spaces, and Information Retrieval. SIAM Review 41(2), 335–362 (1999)
Clark, S., Pulman, S.: Combining symbolic and distributional models of meaning. In: Proceedings of the AAAI Spring Symposium on Quantum Interaction, Stanford, CA, pp. 52–55 (2007)
Deerwester, S.C., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A.: Indexing by Latent Semantic Analysis. Journal of the American Society of Information Science 41(6), 391–407 (1990)
Dowty, D.R., Wall, R., Peters, S.: Introduction to Montague Semantics. Kluwer Academic Publishers, Dordrecht (1981)
Evert, S., Krenn, B.: Methods for the qualitative evaluation of lexical association measures. In: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (2001)
Evert, S.: The statistics of word cooccurrences: word pairs and collocations. Ph.D. Thesis, University of Stuttgart (2004)
Firth, J.: A synopsis of linguistic theory 1930-1955. Studies in Linguistic Analysis, pp. 1–32. Longman (1957)
Frege, G.: Letter to Jourdain. In: Gabriel, G., et al. (eds.) Philosophical and Mathematical Correspondence, Chicago University Press 1980 (1914)
Gärdenfors, P.: Conceptual Spaces: The Geometry of Thought. The MIT Press, Cambridge (2004)
Jones, S., Sinclair, J.M.: English lexical collocations. Cahiers de Lexicologie 24 (1974)
Katz, G., Giesbrecht, E.: Automatic identification of non-compositional multiword expressions using Latent Semantic Analysis. In: Proceedings of the ACL/Coling Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties (2006)
Kintsch, W.: Predication. Cognitive Science 25(2) (2001)
Krenn, B.: The usual suspects: data-oriented models for the identification and representation of lexical collocations. In: Dissertations in Computational Linguistics and Language Technology, German Research Center for Artificial Intelligence and Saarland University, Saarbrücken, Germany (2000)
Landauer, T.K., Dumais, S.T.: A solution to Plato’s problem: the Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211–240 (1997)
Lin, D.: Automatic identification of noncompositional phrases. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (1999)
Manning, C., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press, Cambridge (1999)
Mitchell, J., Lapata, M.: Vector-based models of semantic composition. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 236–244 (2008)
Sag, I.A., Baldwin, T., Bond, F., Copestake, A., Flickinger, D.: Multiword expressions: a pain in the neck for NLP. In: Gelbukh, A. (ed.) CICLing 2002. LNCS, vol. 2276, p. 1. Springer, Heidelberg (2002)
Sahlgren, M.: An Introduction to Random Indexing. In: Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering, TKE, Copenhagen, Denmark, August 16 (2005)
Schone, P., Jurafsky, D.: Is knowledge-free induction of multiword unit dictionary headwords a solved problem? In: Proceedings of Empirical Methods in Natural Language Processing, Pittsburgh, PA (2001)
Schütze, H.: Automatic word sense discrimination. Computational Linguistics 24(1), 97–124 (1998)
Smolensky, P., Legendre, G.: The Harmonic Mind: from Neural Computation to Optimality-Theoretic Grammar. MIT Press, Cambridge (2006)
Sowa, J.F.: Knowledge Representation: Logical, Philosophical, and Computational Foundations. Brooks Cole Publishing Co., Pacific Grove (2000)
Widdows, D.: Semantic vector products: some initial investigations. In: Proceedings of the Second AAAI Symposium on Quantum Interaction (2008)
Widdows, D., Ferraro, K.: Semantic vectors: a scalable open source package and online technology management application. In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008) (2008)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Giesbrecht, E. (2009). In Search of Semantic Compositionality in Vector Spaces. In: Rudolph, S., Dau, F., Kuznetsov, S.O. (eds) Conceptual Structures: Leveraging Semantic Technologies. ICCS 2009. Lecture Notes in Computer Science, vol 5662. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03079-6_14
DOI: https://doi.org/10.1007/978-3-642-03079-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03078-9
Online ISBN: 978-3-642-03079-6
eBook Packages: Computer Science, Computer Science (R0)