Abstract
Clustering plays an important role in data mining, as many applications use it as a preprocessing step for data analysis. Traditional clustering focuses on grouping similar objects, while two-way co-clustering groups dyadic data (objects and their attributes) simultaneously. Most co-clustering research considers only a single correlation matrix, yet dyadic data often come with additional descriptions that could improve co-clustering performance. In this research, we extend ITCC (Information Theoretic Co-Clustering) to the problem of co-clustering with an augmented matrix. We propose CCAM (Co-Clustering with Augmented Matrix) to incorporate this augmented data for better co-clustering. We apply CCAM to the analysis of online advertising, where both ads and users must be clustered. The key data connecting ads and users is the user-ad link matrix, which records the ads each user has linked to; in addition, both ads and users have their own feature data, i.e., the augmented matrices. To evaluate the proposed method, we use two measures: classification accuracy and K-L divergence. The experiments use advertisement and user data from Morgenstern, a financial social website that focuses on advertising agency services. The results show that CCAM outperforms ITCC because it exploits the augmented matrices during clustering.
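As background for the K-L divergence measure used in the evaluation, the following minimal Python sketch builds the co-cluster approximation \(q(a,u)=p(\hat{a},\hat{u})\,p(a\mid\hat{a})\,p(u\mid\hat{u})\) defined for ITCC in [9] and measures how far it departs from the original joint distribution; all function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def itcc_approximation(p, row_labels, col_labels, k, l):
    """Co-cluster approximation q(a,u) = p(a^,u^) p(a|a^) p(u|u^) from ITCC [9].

    p           -- joint distribution matrix (nonnegative, sums to 1)
    row_labels  -- cluster index of each row (ad), values in 0..k-1
    col_labels  -- cluster index of each column (user), values in 0..l-1
    """
    p_row, p_col = p.sum(axis=1), p.sum(axis=0)          # marginals p(a), p(u)
    p_rc = np.array([p_row[row_labels == i].sum() for i in range(k)])  # p(a^)
    p_cc = np.array([p_col[col_labels == j].sum() for j in range(l)])  # p(u^)
    p_block = np.array([[p[np.ix_(row_labels == i, col_labels == j)].sum()
                         for j in range(l)] for i in range(k)])        # p(a^,u^)
    # q(a,u) = p(a^,u^) * (p(a)/p(a^)) * (p(u)/p(u^)) for a in a^, u in u^
    q = (p_block[row_labels][:, col_labels]
         * np.outer(p_row / p_rc[row_labels], p_col / p_cc[col_labels]))
    return q

def kl_divergence(p, q):
    """D(p || q), summed over the support of p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))
```

A smaller \(D(p\,\|\,q)\) means the co-clustering preserves more of the mutual information between ads and users, which is how the K-L measure reported in the experiments can be read; as Appendix C indicates, CCAM additionally weighs analogous divergence terms for the two augmented (feature) matrices.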
Notes
Morgenstern: http://www.morgenstern.com.tw/users2/index.php/.
References
Agarwal D, Merugu S (2007) Predictive discrete latent factor models for large scale dyadic data. In: KDD’07: proceedings of the thirteenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM Press, San Jose, pp 26–35
Mild A, Reutterer T (2001) Collaborative filtering methods for binary market basket data analysis. In: Lecture notes in computer science, pp 302–313
Banerjee A, Dhillon I-S, Ghosh J, Merugu S, Modha D-S (2004) A generalized maximum entropy approach to Bregman co-clustering and matrix approximation. In: KDD’04: proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM Press, Seattle, pp 509–514
Blei D-M, Ng A-Y, Jordan M-I (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022. doi:10.1162/jmlr.2003.3.4-5.993
Chen G, Wang F, Zhang C (2009) Collaborative filtering using orthogonal nonnegative matrix tri-factorization. In: Information processing and management, IPM, pp 368–379
Cover T, Thomas J (1991) Elements of information theory. Wiley, New York
Dai W, Xue G-R, Yang Q, Yu Y (2007) Co-clustering based classification for out-of-domain documents. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, San Jose, California, USA, pp 210–219
Dhillon I-S (2001) Co-clustering documents and words using bipartite spectral graph partitioning. In: KDD’01: proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, California, pp 269–274
Dhillon I-S, Mallela S, Modha D-S (2003) Information theoretic co-clustering. In: KDD’03: proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining. ACM Press, New York, pp 89–98
Ding C, He X, Simon H-D (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. In: Proceedings of the 5th SIAM international conference on data mining, Newport Beach, CA, USA, pp 606–610
Ding C, Li T, Peng W, Park H (2006) Orthogonal nonnegative matrix tri-factorization for clustering. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, Philadelphia, PA, USA, pp 126–135
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. ACM SIGKDD Explor Newsl 11(1):10–18. doi:10.1145/1656274.1656278
Konstas I, Stathopoulos V, Jose J-M (2009) On social networks and collaborative recommendation. In: Proceedings of the 32nd international ACM SIGIR conference on research and development, Boston, MA, USA, pp 195–202
Li B, Yang Q, Xue X (2009) Can movies and books collaborate? Cross-domain collaborative filtering for sparsity reduction. In: Proc of the 21st int’l joint conf on artificial intelligence (IJCAI 2009), pp 2052–2057
Long B, Zhang Z, Yu P-S (2005) Co-clustering by block value decomposition. In: KDD’05: proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining. ACM Press, Chicago, pp 635–640
Scott D-W (2009) Sturges’ rule. WIREs Comput Stat 1:303–306
Shafiei M, Milios E (2005) Model-based overlapping co-clustering
Shafiei M, Milios E (2006) Latent Dirichlet co-clustering. In: Proceedings of the sixth IEEE international conference on data mining (ICDM’06), Hong Kong, December 18–22, 2006, pp 542–551
Shan H, Banerjee A (2008) Bayesian co-clustering. In: Proceedings of the eighth IEEE international conference on data mining (ICDM’08), Pisa, December 15–19, 2008, pp 530–539
Shi K, Li L (2012) High performance genetic algorithm based text clustering using parts of speech and outlier elimination. Appl Intell. doi:10.1007/s10489-012-0382-8
Slonim N, Tishby N (2000) Document clustering using word clusters via the information bottleneck method. In: Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval, Athens, Greece, pp 208–215
Sugiyama K, Hatano K, Yoshikawa M (2004) Adaptive web search based on user profile constructed without any effort from users. In: International world wide web conference proceedings of the 13th international conference on world wide web, New York, NY, USA, pp 675–684
Additional information
This paper is partially supported by the National Science Council, Taiwan, under grant NSC-100-2628-E-8-012-MY3.
Appendices
Appendix A: Proof of Lemma 1
For a fixed co-clustering \((\hat{A}, \hat{U})\), we can rewrite the loss in mutual information as a K-L divergence (relative entropy) measure:
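For orientation, the analogous single-matrix identity is proved for ITCC in [9]; in that paper's notation, with \(p\) a general joint distribution over the two clustered variables, it reads

\[
I(A;U) - I(\hat{A};\hat{U}) = D\bigl(p(A,U)\,\|\,q(A,U)\bigr),
\qquad
q(a,u) = p(\hat{a},\hat{u})\,p(a\mid\hat{a})\,p(u\mid\hat{u}),
\]

with \(p(a\mid\hat{a}) = p(a)/p(\hat{a})\) for \(a\in\hat{a}\); Lemma 1 states the corresponding decompositions for the distributions used by CCAM.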
Proof
By definition,
where the second step uses \(g(a \mid\hat{a})=\frac{g(a)}{g(\hat{a})}\) for \(a\in\hat{a}\), which justifies the equality that follows.
where the second step uses \(h(u \mid\hat{u})=\frac{h(u)}{h(\hat{u})}\) for \(u\in\hat{u}\), which justifies the equality that follows. □
Appendix B: Proof of Lemma 2
Proof
Since Eq. (18) is proved in [9], we focus on the remaining two equations.
The same argument then proves Eq. (20).
□
Appendix C: Proof of Theorem 2
The CCAM algorithm monotonically decreases the objective function in Eq. (6), since
Proof
Let \(\varPhi=\varphi\cdot D(h(U, L) \,\|\,\hat{h}^{(t+1)}(U, L))\). For \(t=1,3,\ldots,2T+1\):
The inequality follows from step 1 since \(C_{A}^{t+1}(a)\) is selected to minimize the objective function.
By an identical argument, we can prove Eq. (32) for \(t=2,4,\ldots,2T+2\). Let \(\varLambda=\lambda\cdot D(f(A, S) \,\|\,\hat{f}^{(t+1)}(A, S))\).
□
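Abstracting away the particular distributions, the argument above is the standard monotonicity proof for alternating minimization. The following Python sketch (function names are illustrative placeholders, not the paper's API) shows the control flow and the invariant that Theorem 2 establishes:

```python
def alternating_coclustering(reassign_ads, reassign_users, objective, T):
    """Schematic CCAM-style loop: odd steps update the ad clustering with the
    user clustering fixed; even steps swap the roles.  Each step is an argmin
    over one argument, so the objective value never increases."""
    history = [objective()]
    for t in range(1, 2 * T + 1):
        if t % 2 == 1:
            reassign_ads()      # minimize over C_A with C_U fixed
        else:
            reassign_users()    # minimize over C_U with C_A fixed
        history.append(objective())
        assert history[-1] <= history[-2] + 1e-12   # monotone decrease
    # The objective is a weighted sum of K-L divergences, hence bounded
    # below by zero, so the decreasing sequence of values converges.
    return history
```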
Cite this article
Wu, ML., Chang, CH. & Liu, RZ. Co-clustering with augmented matrix. Appl Intell 39, 153–164 (2013). https://doi.org/10.1007/s10489-012-0401-9