Efficiency of Transformations of Proximity Measures for Graph Clustering

Aynulin, Rinat

doi:10.1007/978-3-030-25070-6_2

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11631)

Included in the following conference series:

International Workshop on Algorithms and Models for the Web-Graph

Abstract

The choice of proximity measure for nodes greatly affects the results of graph clustering. In this paper, we consider several proximity measures transformed with a number of functions, including the logarithmic function, the power function, and a family of activation functions. The transformations are tested in experiments in which several classical datasets are clustered using the k-means, Ward, and spectral methods. Statistical analysis of the experimental results shows that a number of transformed proximity measures outperform their non-transformed versions. The top-performing transformed measures are the Heat measure transformed with the power function, the Forest measure transformed with the power function, and the Forest measure transformed with the logarithmic function.
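
As a rough illustration of what transforming a proximity measure can look like, the sketch below builds the Heat and Forest kernels on a toy graph and applies elementwise logarithmic and power transformations. It assumes the commonly used definitions exp(-tL) for the Heat kernel and (I + tL)^{-1} for the Forest (regularized Laplacian) kernel; the parameter values t and a, the eps guard, the toy graph, and all function names are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np

def laplacian(A):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(A.sum(axis=1)) - A

def heat_kernel(A, t=1.0):
    """Heat kernel exp(-t L), computed from the eigendecomposition of L."""
    w, V = np.linalg.eigh(laplacian(A))
    return (V * np.exp(-t * w)) @ V.T

def forest_kernel(A, t=1.0):
    """Forest (regularized Laplacian) kernel (I + t L)^{-1}."""
    n = A.shape[0]
    return np.linalg.inv(np.eye(n) + t * laplacian(A))

def log_transform(K, eps=1e-12):
    """Elementwise logarithm; eps (an assumption) guards against log(0)."""
    return np.log(np.maximum(K, eps))

def power_transform(K, a=0.5):
    """Elementwise power K_ij ** a; the exponent a is a hypothetical choice."""
    return np.power(np.maximum(K, 0.0), a)

# Toy graph (assumed for illustration): two triangles joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

K_forest = forest_kernel(A)
K_heat = heat_kernel(A)
K_forest_log = log_transform(K_forest)
K_heat_pow = power_transform(K_heat)
```

In an experiment of the kind the abstract describes, a transformed matrix such as `K_forest_log` or `K_heat_pow` would then be passed to a clustering method (k-means, Ward, or spectral); that step is omitted here.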

Notes

1. Hereinafter, a clustering algorithm is denoted by such a triplet. The first element in a triplet is a clustering method, the second is a proximity measure, and the third is a transformation.
2. Recall that an algorithm here refers to a triplet: a clustering method, a proximity measure, and a transformation.
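
The triplet notation in these notes can be sketched as a small data structure. The example below is only illustrative: the `Algorithm` type and the lists are hypothetical names, and the lists contain only the methods, measures, and transformations named explicitly in the abstract, not the full set studied in the paper.

```python
from collections import namedtuple
from itertools import product

# An "algorithm" in the sense of the notes: (method, measure, transformation).
Algorithm = namedtuple("Algorithm", ["method", "measure", "transformation"])

# Partial lists, limited to the names that appear in the abstract.
methods = ["k-means", "Ward", "spectral"]
measures = ["Heat", "Forest"]
transformations = ["logarithmic", "power"]

# Every combination of method, measure, and transformation is one algorithm.
algorithms = [Algorithm(*t) for t in product(methods, measures, transformations)]
```

Each element of `algorithms` can then be compared with the others as a single unit, which is how the notes use the word "algorithm".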

Author information

Authors and Affiliations

Kotel’nikov Institute of Radio-engineering and Electronics (IRE) of Russian Academy of Sciences, Mokhovaya 11-7, Moscow, 125009, Russia
Rinat Aynulin
Moscow Institute of Physics and Technology, 9 Institutskii per., Dolgoprudny, Moscow Region, 141700, Russia
Rinat Aynulin

Authors

Rinat Aynulin

Corresponding author

Correspondence to Rinat Aynulin.

Editor information

Editors and Affiliations

Inria, Sophia Antipolis, France
Konstantin Avrachenkov
Ryerson University, Toronto, ON, Canada
Paweł Prałat
The University of Queensland, Brisbane, QLD, Australia
Nan Ye

Appendices

Appendix A

CD-diagrams

About this paper

Cite this paper

Aynulin, R. (2019). Efficiency of Transformations of Proximity Measures for Graph Clustering. In: Avrachenkov, K., Prałat, P., Ye, N. (eds) Algorithms and Models for the Web Graph. WAW 2019. Lecture Notes in Computer Science, vol 11631. Springer, Cham. https://doi.org/10.1007/978-3-030-25070-6_2

DOI: https://doi.org/10.1007/978-3-030-25070-6_2
Published: 04 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25069-0
Online ISBN: 978-3-030-25070-6
eBook Packages: Computer Science, Computer Science (R0)
