Synonyms
Definition
High-dimensional datasets are frequently encountered in data mining and statistical learning. Dimension reduction eliminates noisy data dimensions and thus improves accuracy in classification and clustering, in addition to reducing computational cost. Here the focus is on unsupervised dimension reduction. The most widely used technique is principal component analysis (PCA), which is closely related to K-means clustering. Another popular method is Laplacian embedding, which is closely related to spectral clustering.
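As a rough illustration of the two techniques named in the definition, the following sketch reduces a toy dataset to one dimension with PCA and with a Laplacian embedding, then splits the points by the sign of the resulting coordinate. It is a minimal sketch under stated assumptions, not the entry's own algorithm: the function names `pca_reduce` and `laplacian_embedding`, the synthetic two-group data, the Gaussian similarity graph with its bandwidth, and the choice of the unnormalized Laplacian L = D - W are all assumptions made for illustration.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the rows of X (n samples x d features) onto the top-k principal directions."""
    Xc = X - X.mean(axis=0)                     # center each feature
    # SVD of the centered data; the rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                        # n x k coordinates


def laplacian_embedding(W, k):
    """Embed the nodes of a graph with symmetric weight matrix W (n x n) into k dimensions."""
    D = np.diag(W.sum(axis=1))                  # degree matrix
    L = D - W                                   # unnormalized graph Laplacian (assumed variant)
    _, vecs = np.linalg.eigh(L)                 # eigenvalues come back in ascending order
    return vecs[:, 1:k + 1]                     # skip the constant eigenvector (eigenvalue 0)


# Toy data (made up for illustration): two well-separated groups in 5 dimensions
rng = np.random.default_rng(0)
A = rng.normal(loc=0.0, scale=0.1, size=(10, 5))
B = rng.normal(loc=3.0, scale=0.1, size=(10, 5))
X = np.vstack([A, B])

# PCA to one dimension; the sign of the coordinate separates the two groups
y = pca_reduce(X, 1)[:, 0]
pca_labels = (y > 0).astype(int)

# Gaussian similarity graph on the same points (bandwidth chosen arbitrarily),
# followed by a one-dimensional Laplacian embedding
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
W = np.exp(-sq_dists / 20.0)
np.fill_diagonal(W, 0.0)                        # no self-loops
z = laplacian_embedding(W, 1)[:, 0]
lap_labels = (z > 0).astype(int)
```

On data this cleanly separated, both one-dimensional embeddings place the two groups on opposite sides of zero, so thresholding at zero recovers the grouping. Which group receives label 0 or 1 is arbitrary, because eigenvectors and singular vectors are determined only up to sign.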
Historical Background
Principal component analysis (PCA) was introduced by Pearson in 1901 and formalized by Hotelling in 1933. PCA is the foundation for modern dimension reduction. A large number of linear dimension reduction techniques were developed during the 1950s–1970s.
Laplacian graph embedding (also called quadratic placement) was developed by Hall [8] in 1971. Spectral graph partitioning [6] was initially studied in the 1970s; it is...
Recommended Reading
Alpert C.J. and Kahng A.B. Recent directions in netlist partitioning: a survey. Integ. VLSI J., 19:1–81, 1995.
Belkin M. and Niyogi P. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Advances in Neural Information Processing Systems 14, 2001.
Chan P.K., Schlag M., and Zien J.Y. Spectral k-way ratio-cut partitioning and clustering. IEEE Trans. CAD-Integ. Circuit. Syst., 13:1088–1096, 1994.
Ding C. and He X. K-means clustering and principal component analysis. In Proc. 21st Int. Conf. on Machine Learning, 2004.
Ding C., He X., Zha H., and Simon H. Unsupervised learning: self-aggregation in scaled principal component space. Principles of Data Mining and Knowledge Discovery, 6th European Conf., 2002, pp. 112–124.
Fiedler M. Algebraic connectivity of graphs. Czech. Math. J., 23:298–305, 1973.
Hagen L. and Kahng A.B. New spectral methods for ratio cut partitioning and clustering. IEEE Trans. Comput. Aided Desig., 11:1074–1085, 1992.
Hall K.M. R-dimensional quadratic placement algorithm. Manage. Sci., 17:219–229, 1971.
Ng A.Y., Jordan M.I., and Weiss Y. On spectral clustering: Analysis and an algorithm. In Advances in Neural Information Processing Systems 14, 2001.
Pothen A., Simon H.D., and Liou K.P. Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl., 11:430–452, 1990.
Shi J. and Malik J. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 22:888–905, 2000.
Copyright information
© 2009 Springer Science+Business Media, LLC
About this entry
Cite this entry
Ding, C. (2009). Dimension Reduction Techniques for Clustering. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_612
DOI: https://doi.org/10.1007/978-0-387-39940-9_612
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-35544-3
Online ISBN: 978-0-387-39940-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering