Abstract
Co-clustering algorithms have recently been widely used for mining heterogeneous information networks, but the choice of distance metric remains a challenging problem. Traditional co-clustering algorithms measure distance with Bregman divergence, but they consider neither the hierarchical structure nor the features of the entities themselves. This paper proposes an agglomerative hierarchical co-clustering algorithm based on Bregman divergence that learns the hierarchical structure of multiple entities simultaneously. During agglomeration, the cost of merging two co-clusters is measured by a monotonic Bregman function that integrates heterogeneous relations and entity features. The robustness of the algorithm under different divergences is tested on synthetic data sets, and experiments on the DBLP data sets show that it achieves higher accuracy than existing co-clustering algorithms.
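To make the idea of a Bregman merge cost concrete, the following is a minimal sketch of agglomerative clustering on one side of the data only. It is not the co-clustering algorithm of the paper: the function names (`merge_cost`, `agglomerate`) are hypothetical, and the cost used here is the standard Ward-style increase in Bregman information when two clusters are replaced by their weighted mean, which is assumed as a stand-in for the paper's monotonic Bregman function.

```python
import numpy as np

def squared_euclidean(x, y):
    """Bregman divergence generated by phi(x) = ||x||^2."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d @ d)

def generalized_kl(x, y):
    """Bregman divergence generated by phi(x) = sum(x log x); inputs must be > 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * np.log(x / y) - x + y))

def merge_cost(n_a, mu_a, n_b, mu_b, divergence):
    """Assumed cost: loss incurred when both centroids are replaced by the merged one."""
    mu = (n_a * mu_a + n_b * mu_b) / (n_a + n_b)
    return n_a * divergence(mu_a, mu) + n_b * divergence(mu_b, mu)

def agglomerate(points, divergence, n_clusters):
    """Greedy bottom-up merging; each cluster is (size, centroid, member indices)."""
    clusters = [(1, np.asarray(p, dtype=float), [i]) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        # Exhaustive search for the cheapest pair (quadratic per step, fine for a sketch).
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = merge_cost(clusters[i][0], clusters[i][1],
                                  clusters[j][0], clusters[j][1], divergence)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        n_a, mu_a, members_a = clusters[i]
        n_b, mu_b, members_b = clusters[j]
        merged = (n_a + n_b,
                  (n_a * mu_a + n_b * mu_b) / (n_a + n_b),
                  members_a + members_b)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c[2]) for c in clusters]

points = [[1.0, 1.0], [1.1, 1.0], [5.0, 5.0], [5.1, 5.0]]
result = agglomerate(points, squared_euclidean, n_clusters=2)
```

With `squared_euclidean`, `merge_cost` reduces to Ward's criterion; passing `generalized_kl` instead swaps the divergence without changing the merging loop, which is the kind of comparison across divergences that the abstract describes. Extending the sketch to co-clusters would require merging rows and columns jointly, which is not attempted here.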
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Shen, G., Yang, W., Wang, W., Yu, M., Dong, G. (2014). Agglomerative Hierarchical Co-clustering Based on Bregman Divergence. In: Herawan, T., Ghazali, R., Deris, M. (eds) Recent Advances on Soft Computing and Data Mining. Advances in Intelligent Systems and Computing, vol 287. Springer, Cham. https://doi.org/10.1007/978-3-319-07692-8_37
DOI: https://doi.org/10.1007/978-3-319-07692-8_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07691-1
Online ISBN: 978-3-319-07692-8
eBook Packages: Engineering, Engineering (R0)