Abstract
Many applications, such as word-sense disambiguation and information retrieval, can benefit from text classification. Text classifiers based on Independent Component Analysis (ICA) exploit the independent components of text documents and in many cases classify well. Short-text documents, however, usually share few feature terms, and in this case ICA does not work well. Our aim is to solve the short-text problem in text classification by using Latent Semantic Analysis (LSA) as a data preprocessing method and then applying ICA to the preprocessed data. Experiments show that, for Chinese short-text classification, using ICA and LSA together gives better classification results than using ICA alone.
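The two-stage idea described in the abstract (LSA as preprocessing, then ICA) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy term-document matrix, the choice of two latent dimensions, the use of a symmetric FastICA with a tanh nonlinearity as the ICA algorithm, and the helper names `lsa_project`, `fast_ica`, and `cosine`. The final classification step is not shown.

```python
import numpy as np

def lsa_project(X, k):
    """Return the k-dimensional LSA coordinates of each document (k x n_docs)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return s[:k, None] * Vt[:k]

def fast_ica(Z, n_iter=200, tol=1e-8, seed=0):
    """Symmetric FastICA (tanh nonlinearity) on Z with shape (signals, samples)."""
    k, n = Z.shape
    Z = Z - Z.mean(axis=1, keepdims=True)      # center each signal
    d, E = np.linalg.eigh(Z @ Z.T / n)         # eigendecomposition of the covariance
    Zw = (E / np.sqrt(d)).T @ Z                # whiten: covariance becomes identity

    def decorrelate(W):
        # Nearest orthogonal matrix to W, i.e. (W W^T)^(-1/2) W.
        U, _, Vt = np.linalg.svd(W)
        return U @ Vt

    W = decorrelate(np.random.default_rng(seed).standard_normal((k, k)))
    for _ in range(n_iter):
        Y = np.tanh(W @ Zw)
        W_new = decorrelate(Y @ Zw.T / n - (1 - Y**2).mean(axis=1)[:, None] * W)
        # Converged when every row of W keeps its direction (up to sign).
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1)) < tol
        W = W_new
        if converged:
            break
    return W @ Zw                              # estimated independent components

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical short documents, each listed by the indices of the terms it contains.
docs = [[0, 1], [2, 3], [1, 2, 4], [0, 3, 4],        # topic A uses terms 0-4
        [5, 6, 7], [8, 9], [5, 8], [6, 9], [7, 9]]   # topic B uses terms 5-9
X = np.zeros((10, len(docs)))                        # term-document matrix
for j, terms in enumerate(docs):
    X[terms, j] = 1.0

Z = lsa_project(X, k=2)   # stage 1: LSA preprocessing
S = fast_ica(Z)           # stage 2: ICA on the LSA-reduced documents

# Documents 0 and 1 share no terms, yet belong to the same topic.
print("cosine in term space:", cosine(X[:, 0], X[:, 1]))
print("cosine in LSA space: ", cosine(Z[:, 0], Z[:, 1]))
```

In this toy setup, the first two documents have no terms in common, so their similarity in the raw term space is zero, while their LSA coordinates point in nearly the same direction. This is the kind of overlap that LSA is expected to restore before ICA is applied; how the resulting components are turned into class labels depends on the classifier used in the paper.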
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pu, Q., Yang, GW. (2006). Short-Text Classification Based on ICA and LSA. In: Wang, J., Yi, Z., Zurada, J.M., Lu, BL., Yin, H. (eds) Advances in Neural Networks - ISNN 2006. ISNN 2006. Lecture Notes in Computer Science, vol 3972. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11760023_39
DOI: https://doi.org/10.1007/11760023_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34437-7
Online ISBN: 978-3-540-34438-4
eBook Packages: Computer Science, Computer Science (R0)