Abstract
An associative network is a connectionist language model able to handle large sets of documents. In this research we investigated the use of natural language processing techniques (part-of-speech tagging and parsing) in combination with associative networks for document categorization, and compared the results to a TF-IDF baseline. Natural language processing can pre-filter information before it enters the associative network, discarding unwanted observations and preselecting relevant data based on sentence structure, thus improving results.
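The pre-filtering idea and the TF-IDF baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the method used in the paper: the tagged sample data, the rule of keeping only nouns, and the names `pos_filter` and `tf_idf` are all assumptions made for illustration, and the associative network itself is not modelled. The TF-IDF weighting shown (relative term frequency times the log of inverse document frequency) is one common variant among several.

```python
import math
from collections import Counter

# Hypothetical hand-tagged input: (token, Penn Treebank POS tag) pairs.
# A real pipeline would obtain these from a part-of-speech tagger.
TAGGED_DOCS = {
    "doc_a": [("the", "DT"), ("network", "NN"), ("stores", "VBZ"), ("documents", "NNS")],
    "doc_b": [("the", "DT"), ("parser", "NN"), ("reads", "VBZ"), ("sentences", "NNS")],
}

# Assumption: keep only nouns. This is one possible filtering rule,
# not necessarily the one used in the paper.
KEPT_TAG_PREFIXES = ("NN",)


def pos_filter(tagged_tokens, kept_prefixes=KEPT_TAG_PREFIXES):
    """Return lowercased tokens whose POS tag starts with a kept prefix."""
    return [tok.lower() for tok, tag in tagged_tokens if tag.startswith(kept_prefixes)]


def tf_idf(docs):
    """Compute TF-IDF scores for a dict mapping document name to token list."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


# Filter each document by POS tag, then score the surviving terms.
filtered = {name: pos_filter(tokens) for name, tokens in TAGGED_DOCS.items()}
scores = tf_idf(filtered)
```

In this sketch the determiner "the" and the verbs never reach the scoring step, which mirrors the abstract's point that filtering on sentence structure removes unwanted observations before the downstream model sees them.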
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bloom, N. (2012). Using Natural Language Processing to Improve Document Categorization with Associative Networks. In: Bouma, G., Ittoo, A., Métais, E., Wortmann, H. (eds) Natural Language Processing and Information Systems. NLDB 2012. Lecture Notes in Computer Science, vol 7337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31178-9_18
DOI: https://doi.org/10.1007/978-3-642-31178-9_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31177-2
Online ISBN: 978-3-642-31178-9
eBook Packages: Computer Science (R0)