Abstract
The choice of proximity measure for the nodes greatly affects the results of graph clustering. In this paper, we consider several proximity measures transformed with a number of functions, including the logarithmic function, the power function, and a family of activation functions. The transformations are tested in experiments in which several classical datasets are clustered using the k-means, Ward, and spectral methods. Statistical analysis of the experimental results shows that a number of transformed proximity measures outperform their non-transformed versions. The top-performing transformed measures are the Heat measure transformed with the power function, the Forest measure transformed with the power function, and the Forest measure transformed with the logarithmic function.
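As a minimal sketch of what transforming a proximity measure can look like, the code below computes the Heat and Forest kernels of a small graph and applies a logarithmic and a power transformation. It assumes the standard definitions (Heat kernel exp(-tL), Forest kernel (I + tL)^{-1}, with L the combinatorial Laplacian) and assumes that transformations are applied elementwise. The parameter `t`, the exponent `a`, the example graph, and all function names are illustrative choices, not values taken from the paper.

```python
import numpy as np


def laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    A = np.asarray(adjacency, dtype=float)
    return np.diag(A.sum(axis=1)) - A


def heat_kernel(adjacency, t=1.0):
    """Heat kernel exp(-t L), via the eigendecomposition of the symmetric L."""
    eigvals, eigvecs = np.linalg.eigh(laplacian(adjacency))
    return (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T


def forest_kernel(adjacency, t=1.0):
    """Forest (regularized Laplacian) kernel (I + t L)^{-1}."""
    L = laplacian(adjacency)
    return np.linalg.inv(np.eye(L.shape[0]) + t * L)


def log_transform(K):
    """Elementwise logarithm; assumes strictly positive entries."""
    return np.log(K)


def power_transform(K, a=0.5):
    """Elementwise power K_ij ** a; the exponent a is an assumed value."""
    return np.power(K, a)


# Hypothetical example graph: two triangles (nodes 0-2 and 3-5) joined by edge 2-3.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])

forest = forest_kernel(A, t=1.0)
heat = heat_kernel(A, t=1.0)
forest_log = log_transform(forest)
heat_pow = power_transform(heat, a=0.5)
```

Both transformations used here are monotone on positive values, so they preserve the ordering of proximities while changing their relative scale. The transformed matrix would then be passed to a clustering method in place of the original one.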
Notes
1. Hereinafter, a clustering algorithm is denoted by such a triplet. The first element in a triplet is a clustering method, the second is a proximity measure, and the third is a transformation.
2. Recall that an algorithm here refers to a triplet: a clustering method, a proximity measure, and a transformation.
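The triplet notation in the notes above can be sketched as a small data structure. The example below is illustrative only: it lists the clustering methods and the two measures named in the abstract, and the label `"none"` for the non-transformed baseline, the class name `Algorithm`, and the list names are assumptions made for this sketch.

```python
from itertools import product
from typing import NamedTuple


class Algorithm(NamedTuple):
    """A clustering algorithm as a (method, measure, transformation) triplet."""
    method: str
    measure: str
    transformation: str


METHODS = ["k-means", "Ward", "spectral"]
MEASURES = ["Heat", "Forest"]               # only the measures named in the abstract
TRANSFORMATIONS = ["none", "log", "power"]  # "none" is an assumed baseline label

# Every combination of method, measure, and transformation is one algorithm.
algorithms = [Algorithm(*combo)
              for combo in product(METHODS, MEASURES, TRANSFORMATIONS)]
```

Treating each triplet as a separate algorithm makes it possible to compare all combinations on equal terms.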
Appendix A: CD-diagrams
© 2019 Springer Nature Switzerland AG
Cite this paper
Aynulin, R. (2019). Efficiency of Transformations of Proximity Measures for Graph Clustering. In: Avrachenkov, K., Prałat, P., Ye, N. (eds) Algorithms and Models for the Web Graph. WAW 2019. Lecture Notes in Computer Science, vol 11631. Springer, Cham. https://doi.org/10.1007/978-3-030-25070-6_2
DOI: https://doi.org/10.1007/978-3-030-25070-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25069-0
Online ISBN: 978-3-030-25070-6
eBook Packages: Computer Science, Computer Science (R0)