Abstract
Recent advances in machine learning and data mining have produced powerful methods for analyzing and visualizing high-dimensional data. This paper proposes an unsupervised linear dimension reduction algorithm named critical points preserving projection (CPPP). Representing a data set by a few key points has become increasingly popular owing to its effectiveness and efficiency. Rather than treating all data points equally, the proposed algorithm preserves both the local neighborhood structure and the global variance of the critical data points only. We achieve these objectives through a joint modification of locality preserving projection (LPP) and principal component analysis (PCA). Experimental results on UCI data sets show good performance on pattern classification.
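The abstract's idea can be sketched as follows. This is an illustrative sketch only, not the paper's actual CPPP algorithm: the criticality criterion (k-NN density), the trade-off weight `alpha`, and the combined objective (maximize PCA-style scatter of the critical points while penalizing an LPP-style Laplacian term) are all assumptions made for demonstration.

```python
import numpy as np

def cppp_sketch(X, n_critical=50, k=5, alpha=1.0, n_components=2):
    """Linearly project X (n_samples x n_features) to n_components dims,
    fitting the projection on a subset of 'critical' points only."""
    # 1. Pick "critical" points: here, the densest points (smallest mean
    #    distance to their k nearest neighbours) -- an assumed criterion.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    knn_dist = np.sort(D, axis=1)[:, 1:k + 1].mean(axis=1)
    idx = np.argsort(knn_dist)[:n_critical]
    Xc = X[idx] - X[idx].mean(axis=0)          # centred critical points

    # 2. Global variance term: PCA-style scatter of the critical points.
    S = Xc.T @ Xc

    # 3. Local structure term: LPP-style graph Laplacian over a symmetric
    #    k-NN adjacency graph built on the critical points.
    Dc = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2)
    W = np.zeros_like(Dc)
    nbrs = np.argsort(Dc, axis=1)[:, 1:k + 1]  # skip self at column 0
    for i, js in enumerate(nbrs):
        W[i, js] = 1.0
    W = np.maximum(W, W.T)                     # symmetrise adjacency
    L = np.diag(W.sum(axis=1)) - W             # graph Laplacian
    M = Xc.T @ L @ Xc

    # 4. Maximise variance while penalising locality distortion: take the
    #    top eigenvectors of the symmetric matrix (S - alpha * M).
    vals, vecs = np.linalg.eigh(S - alpha * M)
    P = vecs[:, np.argsort(vals)[::-1][:n_components]]
    return X @ P                               # apply linear projection

rng = np.random.default_rng(0)
Y = cppp_sketch(rng.normal(size=(120, 10)))
print(Y.shape)  # (120, 2)
```

Because the projection matrix `P` is fit only on the selected critical points but applied to every sample, the method stays linear and cheap at projection time, which matches the effectiveness/efficiency motivation stated above.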
Acknowledgments
The authors would like to thank the associate editor and all reviewers for their helpful comments and suggestions.
Cite this article
Wu, D., Xiong, N., He, J. et al. Critical data points-based unsupervised linear dimension reduction technology for science data. J Supercomput 72, 2962–2976 (2016). https://doi.org/10.1007/s11227-015-1421-0