Meta-learning Based Evolutionary Clustering Algorithm

Tomp, Dmitry; Muravyov, Sergey; Filchenkov, Andrey; Parfenov, Vladimir

doi:10.1007/978-3-030-33607-3_54

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11871)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

In this work, we address the hard clustering problem. We present a new clustering algorithm based on evolutionary computation that searches for the best partition with respect to a given quality measure. We present 32 partition transformations that are used as mutation operators. The algorithm is a $(1+1)$ evolutionary strategy that, at each step, selects a random mutation from a subset of preselected mutation operators. This preselection is performed by a classifier trained to predict the usefulness of each mutation for a given dataset. A comparison with the state-of-the-art approach to automated clustering algorithm and hyperparameter selection shows the superiority of the proposed algorithm.
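
The search loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the three mutation operators are hypothetical stand-ins for the 32 transformations of the paper, the quality measure is a simple negative within-cluster sum of squares rather than a specific cluster validity index, and the classifier that preselects operators is replaced by a plain list passed in by the caller.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroids(points, labels, k):
    """Mean of each non-empty cluster; empty clusters map to None."""
    sums = [[0.0] * len(points[0]) for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d, x in enumerate(p):
            sums[c][d] += x
    return [([s / n for s in row] if n else None)
            for row, n in zip(sums, counts)]


def quality(points, labels, k):
    """Stand-in quality measure (higher is better): negative
    within-cluster sum of squared distances to centroids."""
    cents = centroids(points, labels, k)
    return -sum(sq_dist(p, cents[c]) for p, c in zip(points, labels))


# Hypothetical mutation operators; each returns a modified copy.
def move_random_point(points, labels, k, rng):
    child = labels[:]
    child[rng.randrange(len(child))] = rng.randrange(k)
    return child


def move_to_nearest_centroid(points, labels, k, rng):
    cents = centroids(points, labels, k)
    i = rng.randrange(len(labels))
    child = labels[:]
    child[i] = min((c for c in range(k) if cents[c] is not None),
                   key=lambda c: sq_dist(points[i], cents[c]))
    return child


def swap_two_points(points, labels, k, rng):
    child = labels[:]
    i, j = rng.randrange(len(child)), rng.randrange(len(child))
    child[i], child[j] = child[j], child[i]
    return child


def one_plus_one_es(points, k, mutations, steps=2000, seed=0):
    """(1+1) strategy: one parent, one child per step; the child
    replaces the parent if its quality is not worse."""
    rng = random.Random(seed)
    parent = [rng.randrange(k) for _ in points]  # random initial partition
    best = quality(points, parent, k)
    for _ in range(steps):
        mutate = rng.choice(mutations)  # random operator from the given subset
        child = mutate(points, parent, k, rng)
        score = quality(points, child, k)
        if score >= best:
            parent, best = child, score
    return parent, best
```

Calling `one_plus_one_es(points, 2, [move_random_point, move_to_nearest_centroid, swap_two_points])` on a list of coordinate tuples returns a label list and its quality. In the paper's setting, the `mutations` argument would be the subset chosen by the trained classifier for the dataset at hand; here it is supplied by hand.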

Notes

1. This phenomenon is most likely related to properties of a specific cluster validity index (CVI) and can be further mitigated, e.g., by applying a different initialization method or by using a more complex mutation/evolutionary scheme.
2. The full collection of comparison boxplots can be found at https://bit.ly/2Zr3WwG.

Acknowledgments

The authors would like to thank Maxim Buzdalov for useful comments. The research was financially supported by The Russian Science Foundation, Agreement 17-71-30029.

Author information

Authors and Affiliations

Machine Learning Lab, ITMO University, 49 Kronverksky Pr., St. Petersburg, 197101, Russia
Dmitry Tomp, Sergey Muravyov & Andrey Filchenkov
Information Technologies and Programming Faculty, ITMO University, 49 Kronverksky Pr., St. Petersburg, 197101, Russia
Dmitry Tomp, Sergey Muravyov, Andrey Filchenkov & Vladimir Parfenov

Corresponding author

Correspondence to Sergey Muravyov.

Editor information

Editors and Affiliations

University of Manchester, Manchester, UK
Hujun Yin
Technical University of Madrid, Madrid, Spain
David Camacho
University of Birmingham, Birmingham, UK
Peter Tino
University of Huelva, Huelva, Spain
Antonio J. Tallón-Ballesteros
University of Exeter, Exeter, UK
Ronaldo Menezes
University of Manchester, Manchester, UK
Richard Allmendinger

About this paper

Cite this paper

Tomp, D., Muravyov, S., Filchenkov, A., Parfenov, V. (2019). Meta-learning Based Evolutionary Clustering Algorithm. In: Yin, H., Camacho, D., Tino, P., Tallón-Ballesteros, A., Menezes, R., Allmendinger, R. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2019. IDEAL 2019. Lecture Notes in Computer Science, vol 11871. Springer, Cham. https://doi.org/10.1007/978-3-030-33607-3_54

DOI: https://doi.org/10.1007/978-3-030-33607-3_54
Published: 18 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33606-6
Online ISBN: 978-3-030-33607-3
eBook Packages: Computer Science, Computer Science (R0)
