Abstract
Networks are ubiquitous in real-world applications, and clustering similar networks is an important problem. Graphs can be characterized by properties such as clustering coefficient (CC), density, and arboricity. We introduce a measure, the Clique Conversion Coefficient (CCC), which captures the clique-forming tendency of nodes in an undirected graph. CCC can be used either as a vector or as a weighted average of the values in that vector. Our experiments show that CCC provides additional information about a graph compared with related measures such as CC and density. We cluster real-world graphs using combinations of the features CCC, CC, and density, and show that without CCC as one of the features, graphs with similar clique-forming tendencies are not clustered together. Clustering with CCC could find applications in areas such as social network analysis and protein-protein interaction analysis, where cliques play an important role. We also cluster ego networks of the YOUTUBE network using the values in the CCC vector as features. The quality of this clustering is analyzed by contrasting the frequent subgraphs in each cluster. The results highlight the utility of CCC in clustering subgraphs of a large graph.
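As a rough illustration of the two baseline features named in the abstract, the sketch below computes the average clustering coefficient (CC) and the density of an undirected simple graph given as an edge list. It uses the standard textbook definitions, not code from the paper, and it does not compute CCC, since the abstract does not define it. The function names (`build_adjacency`, `density`, `average_clustering`, `feature_vector`) are hypothetical, and nodes are assumed to appear in at least one edge.

```python
from itertools import combinations


def build_adjacency(edges):
    """Return {node: set(neighbours)} for an undirected simple graph."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops are ignored in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Density 2m / (n(n-1)) of an undirected simple graph."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge counted twice
    return 2 * m / (n * (n - 1))


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # count edges that exist between pairs of neighbours
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def feature_vector(edges):
    """Return (CC, density) as a two-element feature tuple for one graph."""
    adj = build_adjacency(edges)
    return (average_clustering(adj), density(adj))
```

Feature tuples of this kind, one per graph, could then be passed to any standard clustering routine; a CCC value or vector would be appended as further features once computed according to the paper's definition.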
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Sambaturu, P., Karlapalem, K. (2017). CCCG: Clique Conversion Ratio Driven Clustering of Graphs. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_59
DOI: https://doi.org/10.1007/978-3-319-57529-2_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57528-5
Online ISBN: 978-3-319-57529-2
eBook Packages: Computer Science (R0)