DOI: 10.1145/2554850.2554979
research-article

Single multiplicatively updated matrix factorization for co-clustering

Published: 24 March 2014

Abstract

Co-clustering treats a data matrix symmetrically: a partitioning of rows can induce a partitioning of columns, and vice versa. It has been shown to be advantageous over traditional clustering. However, most co-clustering algorithms have high time and space complexity, which limits their effectiveness on large datasets. To address this problem, we propose a single multiplicatively updated matrix factorization for co-clustering, in which only one factor matrix is updated by a multiplicative rule derived from nonnegative matrix tri-factorization (NMTF), and the other matrices are obtained by alternating nonnegative least squares. Moreover, we extend this hybrid method to symmetric NMTF, which is conducted on a proximity matrix. Extensive experiments on several large text datasets show that our approach outperforms state-of-the-art co-clustering algorithms in terms of purity and entropy, at much lower time and space cost.
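
The abstract reports results in terms of purity and entropy but does not define them. The following is a minimal sketch of how these two measures are commonly computed for a clustering against known class labels. The function names, the toy labels, and the choice to scale entropy by the logarithm of the number of classes are assumptions made for illustration; the paper may define or normalize the measures differently.

```python
import math
from collections import Counter, defaultdict


def contingency(clusters, classes):
    """Count, for each cluster, how many members carry each true class label."""
    table = defaultdict(Counter)
    for cluster_id, class_id in zip(clusters, classes):
        table[cluster_id][class_id] += 1
    return table


def purity(clusters, classes):
    """Fraction of items that belong to the majority class of their cluster."""
    table = contingency(clusters, classes)
    return sum(max(counts.values()) for counts in table.values()) / len(clusters)


def entropy(clusters, classes):
    """Size-weighted class entropy per cluster.

    Assumption: each cluster's entropy is divided by log(number of classes)
    so that the result lies in [0, 1].
    """
    table = contingency(clusters, classes)
    n_classes = len(set(classes))
    if n_classes < 2:
        return 0.0  # a single class carries no uncertainty
    total = 0.0
    for counts in table.values():
        size = sum(counts.values())
        h = -sum((c / size) * math.log(c / size) for c in counts.values())
        total += (size / len(clusters)) * h / math.log(n_classes)
    return total


# Toy labels, made up for illustration only.
clusters = [0, 0, 0, 1, 1, 1]
classes = ["a", "a", "b", "b", "b", "b"]
print(purity(clusters, classes))
print(entropy(clusters, classes))
```

Under these definitions, higher purity and lower entropy both indicate clusters that agree more closely with the reference classes.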

Published In

SAC '14: Proceedings of the 29th Annual ACM Symposium on Applied Computing
March 2014
1890 pages
ISBN:9781450324694
DOI:10.1145/2554850

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SAC 2014
SAC 2014: Symposium on Applied Computing
March 24 - 28, 2014
Gyeongju, Republic of Korea

Acceptance Rates

SAC '14 Paper Acceptance Rate 218 of 939 submissions, 23%;
Overall Acceptance Rate 1,650 of 6,669 submissions, 25%