Improve Network Clustering via Diversified Ranking

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9197)

Abstract

Clustering is a fundamental task in network analysis. A widely used clustering method is k-means, which iteratively refines a clustering by minimizing the distance between each data point and its cluster center. A key issue for k-means is initialization, which heavily affects both its accuracy and its computational cost. This issue is particularly critical when k-means is applied to graph data, where nodes are not embedded in a metric space. In this paper, we propose using a diversified ranking method to initialize k-means clustering, i.e., to find a set of seed nodes. In diversified ranking, seed nodes are selected by considering their centrality and diversity in a unified manner. With the seed nodes as starting points, k-means clustering is then used to cluster the nodes into groups. We apply the proposed method to detect communities in synthetic and real-world networks. The results indicate that the proposed method is both effective and efficient.
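
The seed-then-cluster idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes degree as the centrality measure, enforces diversity by skipping neighbors of already chosen seeds, and replaces the iterative k-means refinement with a single nearest-seed assignment by hop distance. The function names `select_seeds` and `assign_to_seeds` and the toy graph are hypothetical.

```python
from collections import deque


def select_seeds(adj, k):
    """Greedily pick up to k seeds (assumed heuristic, not the paper's ranking).

    Nodes are visited in order of decreasing degree; a node is skipped if it
    is a chosen seed's neighbor, so the seeds are central but spread out.
    """
    blocked = set()
    seeds = []
    # Degree descending, node label ascending as a deterministic tie-breaker.
    for node in sorted(adj, key=lambda n: (-len(adj[n]), n)):
        if len(seeds) == k:
            break
        if node in blocked:
            continue
        seeds.append(node)
        blocked.add(node)
        blocked.update(adj[node])
    return seeds


def assign_to_seeds(adj, seeds):
    """Multi-source BFS: label each reachable node with a nearest seed (in hops).

    Ties go to whichever seed's search front reaches the node first.
    """
    label = {s: s for s in seeds}
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in label:
                label[v] = label[u]
                queue.append(v)
    return label


# Toy undirected graph: community {0, 1, 2, 3} and community {4, 5, 6, 7},
# joined by the single edge 3-7. Nodes 0, 1, 3, and 4 all have degree 3.
adj = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1], 3: [0, 1, 7],
    4: [5, 6, 7], 5: [4, 6], 6: [4, 5], 7: [3, 4],
}

seeds = select_seeds(adj, 2)
labels = assign_to_seeds(adj, seeds)
print(seeds)
print(labels)
```

On this toy graph, picking the two highest-degree nodes without the diversity rule would place both seeds in the first community (nodes 0 and 1); the neighbor-exclusion rule moves the second seed to node 4 instead.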

Corresponding author

Correspondence to Hua-Wei Shen.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sun, BJ., Shen, HW., Cheng, XQ. (2015). Improve Network Clustering via Diversified Ranking. In: Thai, M., Nguyen, N., Shen, H. (eds) Computational Social Networks. CSoNet 2015. Lecture Notes in Computer Science, vol 9197. Springer, Cham. https://doi.org/10.1007/978-3-319-21786-4_9

  • DOI: https://doi.org/10.1007/978-3-319-21786-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21785-7

  • Online ISBN: 978-3-319-21786-4

  • eBook Packages: Computer Science, Computer Science (R0)
