Abstract
Multiobjective approaches to data clustering return sets of solutions that correspond to trade-offs between different clustering objectives. Here, an established ensemble technique (evidence-accumulation) is applied to the identification of shared features within the set of clustering solutions returned by the multiobjective clustering method MOCK. We show that this approach can be employed to achieve a four-fold reduction in the number of candidate solutions, whilst maintaining the accuracy of MOCK’s best clustering solutions. We also find that the resulting knowledge provides a novel design basis for the visual exploration and comparison of different clustering solutions. There are clear parallels with recent work on ‘innovization’, where it was suggested that the design-space analysis of the solution sets returned by multiobjective optimization may provide deep insight into the core design principles of good solutions.
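The notion of shared features across a set of clustering solutions can be illustrated with a co-association matrix, the central data structure in evidence accumulation. The following is a minimal sketch, not the authors' implementation: the function name `co_association` and the toy label vectors are assumptions made for illustration, and the label vectors merely stand in for the trade-off solutions a multiobjective method might return.

```python
import numpy as np


def co_association(partitions):
    """Return the fraction of partitions in which each pair of points shares a cluster."""
    labels = np.asarray(partitions)  # shape: (n_partitions, n_points)
    # same[p, i, j] is True when points i and j have the same label in partition p
    same = labels[:, :, None] == labels[:, None, :]
    return same.mean(axis=0)


# Hypothetical toy solutions: three partitions of six points (illustrative only)
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 0, 1, 1],
]

C = co_association(partitions)

# Pairs (i, j) with i < j that are placed together in every partition
always_together = np.argwhere(np.triu(C == 1.0, k=1))
print(always_together.tolist())
```

Pairs that are co-clustered in every solution are candidates for the shared structure described above; entries of `C` strictly between 0 and 1 mark the points on which the solutions disagree.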
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Handl, J., Knowles, J. (2013). Evidence Accumulation in Multiobjective Data Clustering. In: Purshouse, R.C., Fleming, P.J., Fonseca, C.M., Greco, S., Shaw, J. (eds) Evolutionary Multi-Criterion Optimization. EMO 2013. Lecture Notes in Computer Science, vol 7811. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37140-0_41
DOI: https://doi.org/10.1007/978-3-642-37140-0_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37139-4
Online ISBN: 978-3-642-37140-0
eBook Packages: Computer Science (R0)