Abstract
A sub-symbolic encoding methodology for natural language sentences is presented. The procedure builds an LSA-inspired semantic space and associates rotation operators derived from Geometric Algebra with the word bigrams of a sentence. The operators are then applied to an orthonormal standard basis of the semantic space, in the order in which the words appear in the sentence. The final rotated basis is coded as a vector, and its orthogonal part constitutes the sub-symbolic coding of the sentence. Preliminary experimental results on a classification task, compared with the traditional LSA methodology, show the effectiveness of the approach.
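The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each word already has a vector in the semantic space, it uses the matrix form of a plane rotation in place of an explicit Geometric Algebra rotor, and it simply flattens the final rotated basis without extracting the "orthogonal part" mentioned above. The names `rotation_between`, `encode_sentence`, and `toy_vectors` are hypothetical.

```python
import numpy as np


def rotation_between(a, b):
    """Matrix of the rotation that carries direction a onto direction b.

    The rotation acts only in the plane spanned by a and b and leaves the
    orthogonal complement fixed. This is an assumed matrix stand-in for the
    rotor acting as x -> R x R~ in Geometric Algebra.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    c = float(a @ b)  # cosine of the angle between a and b
    if np.isclose(c, -1.0):
        raise ValueError("antiparallel vectors: rotation plane is not unique")
    K = np.outer(b, a) - np.outer(a, b)  # generator of the rotation plane
    return np.eye(len(a)) + K + (K @ K) / (1.0 + c)


def encode_sentence(words, word_vectors):
    """Apply one rotation per word bigram, in sentence order, to the standard basis."""
    dim = len(next(iter(word_vectors.values())))
    basis = np.eye(dim)  # orthonormal standard basis, one vector per column
    for first, second in zip(words, words[1:]):
        R = rotation_between(word_vectors[first], word_vectors[second])
        basis = R @ basis  # rotate every basis vector
    # Assumption: flatten the rotated basis into one vector as the sentence code.
    return basis.ravel()


# Toy 3-dimensional word vectors, invented for illustration only.
toy_vectors = {
    "dog": np.array([1.0, 0.2, 0.0]),
    "bites": np.array([0.1, 1.0, 0.3]),
    "man": np.array([0.0, 0.2, 1.0]),
}

code = encode_sentence(["dog", "bites", "man"], toy_vectors)
```

Because rotations generally do not commute, applying them in sentence order makes the resulting code sensitive to word order, which a bag-of-words LSA representation is not.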
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pilato, G., Augello, A., Vassallo, G., Gaglio, S. (2007). Geometric Algebra Rotors for Sub-symbolic Coding of Natural Language Sentences. In: Apolloni, B., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2007. Lecture Notes in Computer Science, vol 4692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74819-9_6
DOI: https://doi.org/10.1007/978-3-540-74819-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74817-5
Online ISBN: 978-3-540-74819-9
eBook Packages: Computer Science, Computer Science (R0)