A clustering ensemble framework based on elite selection of weighted clusters

  • Regular Article
  • Advances in Data Analysis and Classification

Abstract

Most clustering algorithms optimize a quality criterion as they run. In conventional algorithms this criterion treats all features as equally important, so every feature participates in the clustering process with the same weight. Yet some features of a dataset clearly carry more information than others, and features with little information or with high variance should often receive lower importance during clustering or classification. Enforcing such a weighting mechanism in any task that would otherwise use all features identically when making a decision has long been a goal of the artificial intelligence community. The open question is how features can participate in the clustering process in a weighted manner. Locally adaptive clustering (LAC) was recently proposed to address this question; however, like its traditional competitors, LAC is inefficient on data with imbalanced clusters. This paper tackles that weakness by proposing a weighted locally adaptive clustering (WLAC) algorithm based on LAC. WLAC, in turn, is sensitive to two parameters that must be tuned manually, and its performance depends on how well they are tuned. The paper proposes two solutions: the first is a simple clustering ensemble framework that copes with WLAC's sensitivity to manual parameter tuning; the second is a cluster selection method.
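
To make the weighting mechanism concrete, here is a minimal sketch of the per-cluster feature weighting scheme of LAC (Domeniconi et al. 2007), on which WLAC builds: each cluster receives a weight vector that favors the dimensions along which its members are tightly concentrated around the centroid. This is an illustrative NumPy rendering under names of our own choosing, not the authors' implementation; h is the LAC tuning parameter whose manual setting motivates the ensemble framework proposed here.

```python
import numpy as np

def lac_weights(points, centroid, h=1.0):
    """Per-dimension weights for one cluster, LAC-style.

    Weights decay exponentially with the cluster's average squared
    spread along each dimension; the vector is then normalized so its
    squared entries sum to 1, as in the LAC formulation. Smaller `h`
    favors low-spread dimensions more sharply.
    """
    X = np.atleast_2d(points)
    spread = np.mean((X - centroid) ** 2, axis=0)  # per-dimension dispersion
    w = np.exp(-spread / h)
    return w / np.sqrt(np.sum(w ** 2))

def weighted_sq_dist(x, centroid, w):
    """Weighted squared Euclidean distance used to assign points."""
    return np.sum(w * (x - centroid) ** 2)

# Toy usage: a cluster that is tight along dimension 0 and loose along 1.
pts = np.array([[0.0, -3.0], [0.1, 2.5], [-0.1, 4.0]])
c = pts.mean(axis=0)
print(lac_weights(pts, c))  # dimension 0 receives almost all the weight
```

The full LAC loop alternates this weight computation with centroid updates and weighted reassignment; WLAC modifies the scheme to cope with imbalanced clusters. The ensemble side can be sketched just as briefly: in evidence-accumulation-style consensus (Fred and Jain 2005), the partitions produced by several runs, e.g. under different parameter settings, vote into a co-association matrix from which a final clustering is extracted. The helper below is again our own hedged illustration and omits the paper's elite cluster selection step.

```python
import numpy as np

def coassociation(labelings, n):
    """n x n matrix whose (i, j) entry is the fraction of runs in
    which points i and j were placed in the same cluster."""
    C = np.zeros((n, n))
    for labels in labelings:
        labels = np.asarray(labels)
        C += (labels[:, None] == labels[None, :]).astype(float)
    return C / len(labelings)

# Three runs over 4 points; the pair (0, 1) always agrees, (2, 3) mostly.
runs = [[0, 0, 1, 1], [0, 0, 1, 2], [1, 1, 0, 0]]
C = coassociation(runs, 4)
print(C[0, 1], C[2, 3])  # 1.0 vs. ~0.67: stable pair vs. less stable pair
```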

References

  • Agrawal R, Gehrke J, Gunopulos D, Raghavan P (1998) Automatic subspace clustering of high dimensional data for data mining applications. In: Proceedings of the 1998 ACM SIGMOD international conference on management of data, pp 94–105

  • Alizadeh H, Minaei-Bidgoli B, Parvin H (2011a) A new criterion for clusters validation. Artificial intelligence applications and innovations, IFIP advances in information and communication technology, vol 364, pp 110–115

  • Alizadeh H, Minaei-Bidgoli B, Parvin H, Moshki M (2011b) An asymmetric criterion for cluster validation. Developing concepts in applied intelligence, Studies in computational intelligence, vol 363, pp 1–14

  • Blum A, Rivest R (1992) Training a 3-node neural network is NP-complete. Neural Netw 5:117–127

  • Chang JW, Jin DS (2002) A new cell-based clustering method for large-high dimensional data in data mining applications. In: Proceedings of the ACM symposium on applied computing, pp 503–507

  • Cheng CH, Fu AW, Zhang Y (1999) Entropy-based subspace clustering for mining numerical data. In: Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining, pp 84–93

  • Domeniconi C, Al-Razgan M (2009) Weighted cluster ensembles: methods and analysis. ACM Trans Knowl Discov Data. doi:10.1145/1460797.1460800

  • Domeniconi C, Gunopulos D, Ma S, Yan B, Al-Razgan M, Papadopoulos D (2007) Locally adaptive metrics for clustering high dimensional data. Data Min Knowl Discov 14:63–97

  • Dudoit S, Fridlyand J (2003) Bagging to improve the accuracy of a clustering procedure. Bioinformatics 19:1090–1099

  • Faceli K, Marcilio CP, Souto D (2006) Multi-objective clustering ensemble. In: Proceedings of the sixth international conference on hybrid intelligent systems (HIS’06)

  • Fern XZ, Lin W (2008) Cluster ensemble selection. In: SIAM international conference on data mining, pp 787–797

  • Fred A (2001) Finding consistent clusters in data partitions. In: Second international workshop on multiple classifier systems, pp 309–318

  • Fred A, Jain AK (2002a) Data clustering using evidence accumulation. In: Proceedings of the 16th international conference on pattern recognition, pp 276–280

  • Fred A, Jain AK (2002b) Evidence accumulation clustering based on the k-means algorithm. In: Joint IAPR international workshops on structural, syntactic, and statistical pattern recognition, pp 442–451

  • Fred A, Jain AK (2005) Combining multiple clusterings using evidence accumulation. IEEE Trans Pattern Anal Mach Intell 27:835–850

  • Jain AK, Dubes RC (1988) Algorithms for clustering data. Prentice Hall, Englewood Cliffs

  • Kohavi R, John RG (1997) Wrappers for feature subset selection. Artif Intell 97:273–324

  • Liu B, Xia Y, Yu PS (2000) Clustering through decision tree construction. In: Proceedings of the ninth international conference on information and knowledge management, pp 20–29

  • Miller R, Yang Y (1997) Association rules over interval data. In: Proceedings of ACM SIGMOD international conference on management of data, pp 452–461

  • Minaei-Bidgoli B, Parvin H, Alinejad H, Alizadeh H, Punch W (2011) Effects of resampling method and adaptation on clustering ensemble efficacy. Artif Intell Rev. doi:10.1007/s10462-011-9295-x

  • Mirzaei A, Rahmati M, Ahmadi M (2008) A new method for hierarchical clustering combination. Intell Data Anal 12:549–571

  • Munkres J (1957) Algorithms for the assignment and transportation problems. J Soc Indus Appl Math 5:32–38

  • Newman DJ, Hettich S, Blake CL, Merz CJ (1998) UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLSummary.html

  • Parsons L, Haque E, Liu H (2004) Subspace clustering for high dimensional data: a review. ACM SIGKDD Explor Newsl 6:90–105

  • Parvin H, Beigi A, Mozayani N (2012a) A clustering ensemble learning method based on the ant colony clustering algorithm. Int J Appl Comput Math 11:286–302

  • Parvin H, Minaei-Bidgoli B, Parvin S, Alinejad H (2012b) A new classifier ensemble methodology based on subspace learning. J Exp Theor Artif Intell. doi:10.1080/0952813X.2012.715683

  • Procopiuc CM, Jones M, Agarwal PK, Murali TM (2002) A Monte Carlo algorithm for fast projective clustering. In: Proceedings of the ACM SIGMOD conference on management of data, pp 418–427

  • Srikant R, Agrawal R (1996) Mining quantitative association rules in large relational tables. In: Proceedings of the ACM SIGMOD conference on management of data

  • Strehl A, Ghosh J (2002) Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583–617

Author information

Correspondence to Hamid Parvin.

About this article

Cite this article

Parvin, H., Minaei-Bidgoli, B. A clustering ensemble framework based on elite selection of weighted clusters. Adv Data Anal Classif 7, 181–208 (2013). https://doi.org/10.1007/s11634-013-0130-x
