Abstract
In this paper, I discuss current developments in cluster analysis in order to bring to light earlier developments by E. Braverman and his team. Specifically, I begin by recalling their Spectrum clustering method and Matrix diagonalization criterion. These involve a number of user-specified parameters, such as the number of clusters and a similarity threshold, which reflects the state of affairs at the early stages of data science; it remains so today. Meanwhile, a data-recovery view of the Principal Component Analysis method admits a natural extension to clustering that embraces two of the most popular clustering methods, K-Means partitioning and Ward agglomerative clustering. To see this, one only needs to adjust the point of view and recognise an equivalent complementary criterion requiring the clusters to be simultaneously “large-sized” and “anomalous”. Moreover, this paradigm shows that the complementary criterion can be reformulated in terms of object-to-object similarities. That criterion turns out to be equivalent to the heuristic Matrix diagonalization criterion of Dorofeyuk and Braverman. Furthermore, a greedy one-by-one cluster extraction algorithm for this criterion turns out to be a version of Braverman’s Spectrum algorithm, but with automated adjustment of its parameters. An illustrative example with mixed-scale data completes the presentation.
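The greedy one-by-one extraction of “anomalous” clusters mentioned above can be illustrated with a minimal sketch, assuming Euclidean feature data standardized so that the grand mean (the reference point) sits at the origin. The function name anomalous_pattern_clusters and the parameters min_size and max_clusters below are illustrative assumptions, not the chapter’s notation; this is a sketch of an iK-Means-style initialisation, not a definitive implementation of the method described in the chapter.

```python
import numpy as np

def anomalous_pattern_clusters(X, min_size=1, max_clusters=None):
    """Greedy one-by-one extraction of anomalous clusters (illustrative sketch).

    X: (n, p) array, assumed standardized so the grand mean is at the origin.
    Returns a list of index arrays, one per extracted cluster.
    """
    remaining = np.arange(len(X))
    clusters = []
    while len(remaining) > 0 and (max_clusters is None or len(clusters) < max_clusters):
        Y = X[remaining]
        # Seed the cluster at the point farthest from the reference point (origin).
        seed = np.argmax((Y ** 2).sum(axis=1))
        centre = Y[seed]
        while True:
            # Assign points that are closer to the tentative centre than to the origin.
            dist_to_centre = ((Y - centre) ** 2).sum(axis=1)
            dist_to_origin = (Y ** 2).sum(axis=1)
            in_cluster = dist_to_centre < dist_to_origin
            in_cluster[seed] = True  # the seed always stays in its own cluster
            new_centre = Y[in_cluster].mean(axis=0)
            if np.allclose(new_centre, centre):
                break
            centre = new_centre
        members = remaining[in_cluster]
        if len(members) >= min_size:
            clusters.append(members)
        remaining = remaining[~in_cluster]
    return clusters
```

In iK-Means-style schemes, clusters smaller than a size threshold are typically discarded and the centroids of the surviving clusters seed a subsequent K-Means run, which is one way the number of clusters can be adjusted automatically rather than specified by the user.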
References
Aiserman, M.A., Braverman, E.M., Rosonoer, L.I.: Method of Potential Functions in the Theory of Machine Learning. Nauka Publishers: Main Editorial for Physics and Mathematics, Moscow (1970). (in Russian)
de Amorim, R., Makarenkov, V., Mirkin, B.: A-Ward\(_{p\beta }\): effective hierarchical clustering using the Minkowski metric and a fast k-means initialisation. Inf. Sci. 370, 343–354 (2016)
de Amorim, R.C., Shestakov, A., Mirkin, B., Makarenkov, V.: The Minkowski central partition as a pointer to a suitable distance exponent and consensus partitioning. Patt. Recogn. 67, 62–72 (2017)
Arkadiev, A.G., Braverman, E.M.: Machine Learning for Classification of Objects. Nauka Publishers: Main Editorial for Physics and Mathematics, Moscow (1971). (in Russian)
Bashkirov, O.A., Braverman, E.M., Muchnik, I.B.: Algorithms for machine learning of visual patterns using potential functions. Autom. Remote Control 5, 25 (1964). (in Russian)
Braverman, E., Dorofeyuk, A., Lumelsky, V., Muchnik, I.: Diagonalization of similarity matrices and measuring of hidden factors. In: Issues of extension of capabilities of automata, pp. 42–79. Institute of Control Problems Press, Moscow (1971). (in Russian)
Chiang, M., Mirkin, B.: Intelligent choice of the number of clusters in k-means clustering: an experimental study with different cluster spreads. J. Classif. 27(1), 3–40 (2010)
Dorofeyuk, A.A.: Machine learning algorithm for unsupervised pattern recognition based on the method of potential functions. Autom. Remote Control (USSR) 27, 1728–1737 (1966)
Filippone, M., Camastra, F., Masulli, F., Rovetta, S.: A survey of kernel and spectral methods for clustering. Patt. Recogn. 41(1), 176–190 (2008)
Frey, B., Dueck, D.: Clustering by passing messages between data points. Science 315(5814), 972–976 (2007)
Holzinger, K.J., Harman, H.H.: Factor Analysis. University of Chicago Press, Chicago (1941)
Kung, S.Y.: Kernel Methods and Machine Learning. Cambridge University Press, Cambridge (2014)
Mirkin, B.G.: The method of principal clusters. Autom. Remote Control 48(10), 1379–1388 (1987)
Mirkin, B.: Sequential fitting procedures for linear data aggregation model. J. Classif. 7, 167–195 (1990)
Mirkin, B.: Core Concepts in Data Analysis: Summarization, Correlation, Visualization. Springer, London (2011)
Mirkin, B.: Clustering: A Data Recovery Approach. Chapman and Hall/CRC Press (2012)
Mirkin, B., Tokmakov, M., de Amorim, R., Makarenkov, V.: Capturing the number of clusters with K-Means using a complementary criterion, affinity propagation, and Ward agglomeration (2017). (Submitted)
Taran, Z., Mirkin, B.: Exploring patterns of corporate social responsibility using a complementary k-means clustering criterion (2017). (Submitted)
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Mirkin, B. (2018). Braverman’s Spectrum and Matrix Diagonalization Versus iK-Means: A Unified Framework for Clustering. In: Rozonoer, L., Mirkin, B., Muchnik, I. (eds) Braverman Readings in Machine Learning. Key Ideas from Inception to Current State. Lecture Notes in Computer Science, vol. 11100. Springer, Cham. https://doi.org/10.1007/978-3-319-99492-5_2
DOI: https://doi.org/10.1007/978-3-319-99492-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99491-8
Online ISBN: 978-3-319-99492-5