Abstract
Today, digital data is accumulated at an unprecedented speed in science, engineering, biomedicine, and real-world sensing. The ubiquitous phenomenon of massive data and sparse information poses considerable challenges for data mining research. In this paper, we propose a theoretical framework, Exemplar-based low-rank sparse Matrix Decomposition (EMD), to cluster large-scale datasets. Capitalizing on recent advances in matrix approximation and decomposition, EMD can efficiently partition datasets of high dimension and scalable size. Specifically, given a data matrix, EMD first computes a representative data subspace and a near-optimal low-rank approximation. The cluster centroids and indicators are then obtained through matrix decomposition, in which we require that the cluster centroids lie within the representative data subspace. By selecting the representative exemplars, we obtain a compact “sketch” of the data, which makes the clustering highly efficient and robust to noise. In addition, the clustering results are sparse and easy to interpret. From a theoretical perspective, we prove the correctness and convergence of the EMD algorithm, and provide a detailed analysis of its efficiency, including running time and space requirements. Through extensive experiments performed on both synthetic and real datasets, we demonstrate the performance of EMD for clustering large-scale data.
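To make the two-stage idea concrete, the following is a minimal, hypothetical sketch of exemplar-constrained clustering in the spirit described above, not the paper's actual algorithm. It assumes a nonnegative data matrix; exemplars are sampled with probability proportional to squared column norm (a standard randomized low-rank heuristic, used here as a stand-in for the paper's exemplar-selection rule), and the factorization X ≈ (C W) Gᵀ, with centroids constrained to the span of the exemplar columns C, is fit by convex-NMF-style multiplicative updates. The function name and parameters are illustrative.

```python
import numpy as np

def exemplar_cluster(X, n_exemplars, n_clusters, n_iter=100, seed=0):
    """Illustrative exemplar-constrained clustering:
    1) sample exemplar columns C of X with probability ~ squared column norm,
    2) fit X ~= (C @ W) @ G.T with nonnegative W, G by multiplicative updates,
       so every centroid lies in the span of the exemplar 'sketch' C,
    3) read each point's cluster label off the largest entry in its row of G.
    Assumes X is entrywise nonnegative (d features x n points)."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    # 1) exemplar selection: squared-norm column sampling
    p = (X ** 2).sum(axis=0)
    p = p / p.sum()
    idx = rng.choice(n, size=n_exemplars, replace=False, p=p)
    C = X[:, idx]                              # d x m compact sketch of the data
    # 2) alternating multiplicative updates for || X - C W G^T ||_F^2
    W = rng.random((n_exemplars, n_clusters))  # centroid coefficients over exemplars
    G = rng.random((n, n_clusters))            # soft cluster indicators
    eps = 1e-9                                 # guard against division by zero
    for _ in range(n_iter):
        F = C @ W                              # centroids constrained to span(C)
        G *= (X.T @ F) / (G @ (F.T @ F) + eps)
        W *= (C.T @ X @ G) / (C.T @ C @ W @ (G.T @ G) + eps)
    # 3) hard labels from the indicator matrix
    return idx, G.argmax(axis=1)
```

Because the centroids are linear combinations of only `n_exemplars` actual data columns, each iteration works with an `n_exemplars`-sized sketch rather than the full matrix, and the resulting centroids are directly interpretable in terms of concrete exemplar points.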










Responsible editor: Sugato Basu.
Cite this article
Wang, L., Dong, M. Exemplar-based low-rank matrix decomposition for data clustering. Data Min Knowl Disc 29, 324–357 (2015). https://doi.org/10.1007/s10618-014-0347-0