ABSTRACT
Semi-supervised consensus clustering integrates supervised information into consensus clustering in order to improve clustering quality. In this paper, we study Semi-MultiCons, a novel semi-supervised consensus clustering method that extends the earlier MultiCons approach. Semi-MultiCons aims to improve the clustering result by integrating pairwise constraints into the consensus creation process, and to infer the number of clusters K using frequent closed itemsets extracted from the ensemble members. Experimental results show that the proposed method outperforms other state-of-the-art semi-supervised consensus algorithms.
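To make the two ingredients named in the abstract concrete, the following minimal sketch shows one way to extract frequent closed itemsets from a set of base clusterings and to count pairwise-constraint violations for a candidate group of objects. It is an illustration under stated assumptions, not the Semi-MultiCons algorithm: the function names, the encoding of an item as a (clustering index, cluster label) pair, and the brute-force intersection strategy are all hypothetical choices made for this example.

```python
def membership_items(base_clusterings):
    """Map each object to the set of (clustering index, label) items it belongs to.

    base_clusterings is a list of label lists, one per ensemble member
    (an assumed input format, chosen for this sketch).
    """
    n_objects = len(base_clusterings[0])
    return [
        frozenset((m, labels[i]) for m, labels in enumerate(base_clusterings))
        for i in range(n_objects)
    ]


def closed_itemsets(transactions, min_support=1):
    """Return {closed itemset: supporting object indices}.

    Uses the fact that the non-empty closed itemsets are exactly the
    non-empty intersections of transactions. This brute-force closure
    is only practical for small examples.
    """
    closed = set(transactions)
    frontier = set(closed)
    while frontier:
        new = set()
        for a in frontier:
            for b in closed:
                c = a & b
                if c and c not in closed:
                    new.add(c)
        closed |= new
        frontier = new

    result = {}
    for itemset in closed:
        support = frozenset(
            i for i, t in enumerate(transactions) if itemset <= t
        )
        if len(support) >= min_support:
            result[itemset] = support
    return result


def violated_constraints(group, must_link, cannot_link):
    """Count pairwise constraints violated by treating `group` as one cluster.

    A must-link pair is violated when the group splits it; a cannot-link
    pair is violated when both objects fall inside the group.
    """
    group = set(group)
    split_must = sum(1 for a, b in must_link if (a in group) != (b in group))
    joined_cannot = sum(1 for a, b in cannot_link if a in group and b in group)
    return split_must + joined_cannot
```

In this sketch, each closed itemset corresponds to a group of objects that a set of base clusters agrees on, and `violated_constraints` gives a simple score for checking such a group against must-link and cannot-link pairs. How Semi-MultiCons actually combines these two pieces is described in the paper, not here.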