
Critical data points-based unsupervised linear dimension reduction technology for science data

The Journal of Supercomputing

Abstract

Recent advances in machine learning and data mining have produced powerful methods for analyzing and visualizing high-dimensional data. This paper proposes an unsupervised linear dimension reduction algorithm named critical points preserving projection (CPPP). Selecting a few key data points to represent the rest has become increasingly popular owing to its effectiveness and efficiency. Rather than treating all data points equally, the proposed algorithm preserves only the local neighborhood structure and the global variance of the critical data points. To achieve both objectives, we explore a joint modification of locality preserving projection (LPP) and principal component analysis (PCA). Experimental results on UCI data sets show good performance on pattern classification.
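The abstract describes combining an LPP-style locality term with a PCA-style variance term, computed over a subset of "critical" data points. The following is a minimal sketch of such a joint objective, not the authors' exact formulation: the critical-point selection (here, degree centrality in a k-NN graph), the trade-off parameter `mu`, and all function names are assumptions made for illustration.

```python
import numpy as np

def cppp_sketch(X, n_components=2, k=5, mu=0.5):
    """Hypothetical sketch of a CPPP-style linear projection.

    Maximizes a PCA variance term minus an LPP locality penalty,
    both evaluated on a "critical" subset of the data. The selection
    rule and objective weighting are illustrative assumptions only.
    """
    n, _ = X.shape
    Xc = X - X.mean(axis=0)  # center the data

    # Binary, symmetrized k-nearest-neighbor affinity matrix
    dist = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2)
    idx = np.argsort(dist, axis=1)[:, 1:k + 1]  # skip self (column 0)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, idx[i]] = 1.0
    W = np.maximum(W, W.T)

    # "Critical" points: top half by graph-degree centrality (assumption)
    degree = W.sum(axis=1)
    crit = np.argsort(degree)[n // 2:]
    Xk = Xc[crit]
    Wk = W[np.ix_(crit, crit)]

    # LPP graph Laplacian on critical points; PCA covariance term
    Lk = np.diag(Wk.sum(axis=1)) - Wk
    C = Xk.T @ Xk / len(crit)           # global variance (PCA term)
    M = C - mu * (Xk.T @ Lk @ Xk)       # joint objective matrix

    # Projection directions: top eigenvectors of the symmetric objective
    _, vecs = np.linalg.eigh((M + M.T) / 2)
    A = vecs[:, ::-1][:, :n_components]
    return Xc @ A                        # project all points, not just critical ones
```

Note that although the objective is built only from the critical subset, the learned linear map is applied to every sample, which mirrors the "select some key points to represent the others" idea in the abstract.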



Notes

  1. http://archive.ics.uci.edu/ml/datasets.html.



Acknowledgments

The authors would like to thank the associate editor and all reviewers for their helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Chuanhe Huang.


About this article


Cite this article

Wu, D., Xiong, N., He, J. et al. Critical data points-based unsupervised linear dimension reduction technology for science data. J Supercomput 72, 2962–2976 (2016). https://doi.org/10.1007/s11227-015-1421-0
