An improved clustering ensemble method based link analysis

Hao, Zhi-Feng; Wang, Li-Juan; Cai, Rui-Chu; Wen, Wen

doi:10.1007/s11280-013-0208-6

Published: 20 March 2013

World Wide Web, Volume 18, pages 185–195 (2015)

Zhi-Feng Hao^1,2, Li-Juan Wang^1,2, Rui-Chu Cai^1 & Wen Wen^1

Abstract

A clustering ensemble aggregates several base clusterings into a single consensus clustering, which is more accurate, stable and meaningful than the result of a standard clustering algorithm. In this paper, the ensemble information is described by a data-cluster association matrix. Most data-cluster association matrices, however, overlook an important type of information: the relationships between clusters. This paper proposes a new method, WETU, which refines the data-cluster association matrix with a link-based similarity measure. The refined matrix is computed from the similarity of clusters across all base clustering results, rather than within a single base clustering result. WETU also provides more discriminative information than CSM and WTU. The refinement turns the data-cluster association matrix into a high-level real-valued matrix, which can then be aggregated by a real-valued method such as Global k-means. Experiments on a synthetic dataset and on UCI datasets show that the proposed method outperforms standard K-means, the base clustering algorithm, CSM+Global k-means and WTU+Global k-means.
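
To make the idea of refining a data-cluster association matrix concrete, the following is a minimal sketch in Python with NumPy. It is not the WETU measure, whose formula is not given in the abstract. As a stand-in it uses a simplified connected-triple style similarity, in which two clusters count as similar when they both overlap a third cluster from another base clustering. The function names, the `decay` parameter and the toy label vectors are all assumptions made for illustration.

```python
import numpy as np


def association_matrix(labelings):
    """Stack one-hot cluster memberships from every base clustering.

    labelings: list of 1-D integer label arrays, one per base clustering.
    Returns (B, owners): B is an n_points x n_clusters 0/1 matrix and
    owners[j] is the index of the base clustering that column j came from.
    """
    columns, owners = [], []
    for m, labels in enumerate(labelings):
        labels = np.asarray(labels)
        for c in np.unique(labels):
            columns.append((labels == c).astype(float))
            owners.append(m)
    return np.column_stack(columns), np.array(owners)


def connected_triple_similarity(B, decay=0.9):
    """Link-based similarity between all pairs of clusters in the ensemble.

    Placeholder measure (an assumption, NOT the WETU formula): clusters x and y
    are similar if both overlap some third cluster k.
    """
    inter = B.T @ B                      # shared members per cluster pair
    sizes = B.sum(axis=0)
    w = inter / (sizes[:, None] + sizes[None, :] - inter)  # Jaccard edge weights
    np.fill_diagonal(w, 0.0)
    # triple[x, y] = sum over third clusters k of min(w[x, k], w[y, k])
    triple = np.minimum(w[:, None, :], w[None, :, :]).sum(axis=2)
    np.fill_diagonal(triple, 0.0)
    peak = triple.max()
    sim = decay * triple / peak if peak > 0 else np.zeros_like(triple)
    np.fill_diagonal(sim, 1.0)           # a cluster is fully similar to itself
    return sim


def refine(B, owners, sim):
    """Turn the 0/1 matrix into a real-valued one.

    Within each base clustering, a point's entry for a cluster becomes the
    similarity between that cluster and the cluster the point belongs to.
    """
    R = B.copy()
    for m in np.unique(owners):
        cols = np.where(owners == m)[0]
        # home[i] = column of the cluster that point i belongs to in clustering m
        home = cols[np.argmax(B[:, cols], axis=1)]
        R[:, cols] = sim[home][:, cols]
    return R


# Toy ensemble: two base clusterings of six points (hypothetical data).
labelings = [[0, 0, 1, 1, 2, 2], [0, 0, 0, 1, 1, 1]]
B, owners = association_matrix(labelings)
R = refine(B, owners, connected_triple_similarity(B))
print(R.round(2))
```

Entries that were 1 in the binary matrix stay 1, while some former zeros become positive values that reflect how clusters relate across the whole ensemble. The rows of `R` could then be passed to any real-valued clustering method to obtain a consensus partition. The abstract names Global k-means for that step, which this sketch does not implement.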

Author information

Authors and Affiliations

Faculty of Computer, Guangdong University of Technology, Guangzhou, 510006, China
Zhi-Feng Hao, Li-Juan Wang, Rui-Chu Cai & Wen Wen
School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
Zhi-Feng Hao & Li-Juan Wang

Corresponding author

Correspondence to Li-Juan Wang.

About this article

Cite this article

Hao, ZF., Wang, LJ., Cai, RC. et al. An improved clustering ensemble method based link analysis. World Wide Web 18, 185–195 (2015). https://doi.org/10.1007/s11280-013-0208-6

Received: 31 October 2012
Revised: 22 February 2013
Accepted: 04 March 2013
Published: 20 March 2013
Issue Date: March 2015
DOI: https://doi.org/10.1007/s11280-013-0208-6
