Abstract
Efficiently searching for the top-k representative vertices is crucial for understanding the structure of large dynamic graphs. Recent studies show that a community formed by a vertex with a high local clustering coefficient and its neighbours can achieve faster information propagation as well as faster disease transmission. However, the local clustering coefficient, which measures the cliquishness of a vertex in its local neighbourhood, favours vertices with small degrees. To remedy this issue, in this paper we propose a new ranking measure, the weighted clustering coefficient (WCC) of a vertex, which integrates both the local clustering coefficient and the degree. WCC not only inherits the properties of the local clustering coefficient but also approximately measures the density (i.e., average degree) of the vertex's neighbourhood subgraph. Thus, vertices with higher WCC are more likely to be representative. We study how to efficiently compute and monitor the top-k representative vertices based on WCC over large dynamic graphs. To reduce the search space, we propose a series of heuristic upper bounds for WCC that prune a large portion of disqualified vertices. We also develop an approximation algorithm that uses the Flajolet-Martin sketch to trade acceptable accuracy for enhanced efficiency. An efficient incremental algorithm for handling frequent updates in dynamic graphs is explored as well. Extensive experimental results on a variety of real-life graph datasets demonstrate the efficiency and effectiveness of our approaches.
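To make the degree bias concrete, the following is a minimal sketch rather than the algorithm of the paper. It assumes WCC is the product of a vertex's degree and its local clustering coefficient; the abstract only says the two are integrated, so that formula, the function names (`local_clustering`, `wcc`, `top_k`) and the toy graph are all illustrative assumptions. The sketch scores every vertex by brute force, with none of the upper-bound pruning, sketching or incremental maintenance described above.

```python
from itertools import combinations
import heapq


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves adjacent."""
    nbrs = adj[v]
    d = len(nbrs)
    if d < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (d * (d - 1))


def wcc(adj, v):
    # ASSUMPTION: WCC(v) = degree(v) * local clustering coefficient of v.
    # This equals 2 * links / (d - 1), close to the average degree
    # 2 * links / d of the subgraph induced by v's neighbours.
    return len(adj[v]) * local_clustering(adj, v)


def top_k(adj, k):
    """Brute-force top-k by WCC (no pruning; for illustration only)."""
    return heapq.nlargest(k, adj, key=lambda v: wcc(adj, v))


# Hypothetical toy graph: a 4-clique {a, b, c, d} joined to a
# triangle {e, f, g} by the edge d-e.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d", "f", "g"},
    "f": {"e", "g"},
    "g": {"e", "f"},
}

for v in sorted(adj):
    print(v, round(local_clustering(adj, v), 3), round(wcc(adj, v), 3))
```

In this toy graph the degree-2 vertices `f` and `g` tie with the clique members `a`, `b` and `c` on the plain local clustering coefficient, whereas under the assumed product form the clique members rank strictly higher.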









Cite this article
Li, X., Chang, L., Zheng, K. et al. Ranking weighted clustering coefficient in large dynamic graphs. World Wide Web 20, 855–883 (2017). https://doi.org/10.1007/s11280-016-0420-2
DOI: https://doi.org/10.1007/s11280-016-0420-2