Abstract
Dimensionality reduction is a frequent pre-processing step in classification tasks. It can improve classification accuracy by representing the dataset better, and it alleviates the curse of dimensionality by reducing the number of dimensions. Traditional dimensionality reduction techniques such as PCA and kernel PCA find a lower-dimensional subspace that best represents the higher-dimensional dataset. Random projection, on the other hand, is a dimension reduction technique that tries to approximately preserve the topology of the higher-dimensional data in a lower-dimensional space. Both approaches reduce dimensions, but because their objectives differ they have not been successfully integrated. Here we show that in practice, and more specifically in a supervised setting such as classification, the two methods can be linked to make random projection more informed, so that the low-dimensional representation is competitive with the original dataset with respect to classification accuracy. In this paper we propose a novel dimensionality reduction technique, informed weighted random projection, that combines kernel PCA and random projection in an efficient way. Kernel PCA is applied first to obtain a subspace of reduced dimension; the new lower-dimensional bases derived by kernel PCA are then weighted in proportion to the measured robustness coefficient of each basis vector. The proposed scheme has been applied to several benchmark datasets from the UCI repository, and experimental results show that informed weighted random projection attains higher accuracy than the usual unweighted combination on all the datasets used in our experiments.
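The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the kernel PCA coordinates have already been computed, and it takes the per-basis weights as given, because the abstract does not say how the robustness coefficient is measured. The function name `weighted_random_projection`, its parameters, and all numeric values are hypothetical.

```python
import math
import random


def weighted_random_projection(rows, weights, out_dim, seed=0):
    """Scale each input coordinate by its weight, then apply a Gaussian
    random projection down to out_dim dimensions.

    rows    : list of samples, each a list of d coordinates (assumed to be
              kernel PCA coordinates computed elsewhere)
    weights : d per-basis weights; placeholders for the robustness
              coefficients, whose computation is not given in the abstract
    """
    d = len(weights)
    rng = random.Random(seed)
    # Entries drawn from N(0, 1/out_dim), so squared lengths are preserved
    # in expectation when every weight equals 1.
    scale = 1.0 / math.sqrt(out_dim)
    R = [[rng.gauss(0.0, 1.0) * scale for _ in range(out_dim)]
         for _ in range(d)]

    projected = []
    for row in rows:
        if len(row) != d:
            raise ValueError("row length must match number of weights")
        scaled = [w * x for w, x in zip(weights, row)]
        projected.append(
            [sum(scaled[j] * R[j][k] for j in range(d))
             for k in range(out_dim)]
        )
    return projected


# Made-up kernel PCA coordinates (3 components) for two samples.
Z = [[1.0, 0.5, -0.2], [0.3, -1.1, 0.8]]
weights = [0.6, 0.3, 0.1]  # made-up weights, for illustration only
Y = weighted_random_projection(Z, weights, out_dim=2)
print(Y)
```

Setting every weight to 1 recovers an ordinary unweighted random projection of the kernel PCA coordinates, which appears to be the baseline the abstract compares against.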
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sen, J., Karnick, H. (2013). Informed Weighted Random Projection for Dimension Reduction. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8347. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53917-6_39
DOI: https://doi.org/10.1007/978-3-642-53917-6_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53916-9
Online ISBN: 978-3-642-53917-6
eBook Packages: Computer Science, Computer Science (R0)