Informed Weighted Random Projection for Dimension Reduction

  • Conference paper
Advanced Data Mining and Applications (ADMA 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8347)

Abstract

Dimensionality reduction is a frequent pre-processing step in classification tasks. It can improve classification accuracy by representing the dataset better, and it alleviates the curse of dimensionality by reducing the number of dimensions. Traditional techniques such as PCA and kernel PCA are well known: they find a lower-dimensional subspace that best represents the higher-dimensional dataset. Random projection can also be regarded as a dimension reduction technique, one that tries to approximately preserve the topology of the higher-dimensional data in a lower-dimensional space. Both approaches reduce dimensions, but because their objectives differ they have not been successfully integrated. Here we show that in practice, and specifically in a supervised setting such as classification, the two methods can be linked so that random projection becomes more informed and the low-dimensional representation is competitive with the original dataset in classification accuracy. We propose a novel dimensionality reduction technique, informed weighted random projection, that combines kernel PCA and random projection in an efficient way. Kernel PCA is applied first to obtain a subspace of reduced dimension; the new lower-dimensional basis vectors derived by kernel PCA are then weighted in proportion to the measured robustness coefficient of each basis vector. The proposed scheme has been applied to several benchmark datasets from the UCI repository, and experimental results show that informed weighted random projection attains higher accuracy than the usual unweighted combination on all the datasets used in our experiments.
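The two-stage pipeline described in the abstract (kernel PCA first, then a weighted random projection of the resulting components) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the robustness coefficient, so the sketch substitutes normalized kernel PCA eigenvalues as a hypothetical stand-in weight. The function names, the RBF kernel, the `gamma` value, and the Gaussian projection matrix are likewise assumptions made for the example.

```python
import numpy as np


def rbf_kernel_pca(X, n_components, gamma=1.0):
    """Kernel PCA with an RBF kernel, written out with NumPy only.

    Returns the training points projected onto the top components,
    together with the corresponding eigenvalues.
    """
    # Pairwise squared Euclidean distances, then the RBF kernel matrix.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * d2)

    # Center the kernel matrix in feature space.
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    eigvals = np.maximum(eigvals[::-1][:n_components], 0.0)
    eigvecs = eigvecs[:, ::-1][:, :n_components]

    # Projection of training point i on component j is sqrt(lambda_j) * alpha_ij.
    Z = eigvecs * np.sqrt(eigvals)
    return Z, eigvals


def weighted_random_projection_sketch(Z, weights, out_dim, rng):
    """Scale each component by its weight, then apply a Gaussian random projection.

    ASSUMPTION: `weights` stands in for the paper's robustness coefficients,
    which the abstract does not define.
    """
    w = weights / weights.sum()
    R = rng.standard_normal((Z.shape[1], out_dim)) / np.sqrt(out_dim)
    return (Z * w) @ R


rng = np.random.default_rng(0)
X = rng.standard_normal((60, 10))  # synthetic data, for illustration only
Z, eigvals = rbf_kernel_pca(X, n_components=8, gamma=0.1)
Y = weighted_random_projection_sketch(Z, eigvals, out_dim=3, rng=rng)
print(Z.shape, Y.shape)  # (60, 8) (60, 3)
```

Setting all weights equal gives the unweighted combination that the abstract uses as its baseline; any other per-component score could be passed in place of the eigenvalues.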

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sen, J., Karnick, H. (2013). Informed Weighted Random Projection for Dimension Reduction. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8347. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53917-6_39

  • DOI: https://doi.org/10.1007/978-3-642-53917-6_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-53916-9

  • Online ISBN: 978-3-642-53917-6

  • eBook Packages: Computer Science, Computer Science (R0)