Skip to main content

Abstract

Block modeling algorithms are used to discover important latent structures in graphs. They are the graph equivalent of clustering algorithms. However, existing block modeling algorithms work directly on the given graphs, making them computationally expensive and less effective on large complex graphs. In this paper, we propose a FastMap-based algorithm for block modeling on single-view undirected graphs. FastMap embeds a given undirected graph into a Euclidean space in near-linear time such that the pairwise Euclidean distances between vertices approximate a desired graph-based distance function between them. In the first phase, our FastMap-based block modeling (FMBM) algorithm uses FastMap with a probabilistically-amplified shortest-path distance function between vertices. In the second phase, it uses Gaussian Mixture Models (GMMs) for identifying clusters (blocks) in the resulting Euclidean space. FMBM outperforms other state-of-the-art methods on many benchmark and synthetic instances, both in efficiency and solution quality. It also enables a perspicuous visualization of clusters (blocks) in the graphs, not provided by other methods.

This work at the University of Southern California is supported by DARPA under grant number HR001120C0157 and by NSF under grant numbers 1409987, 1724392, 1817189, 1837779, 1935712, and 2112533. The views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the sponsoring organizations, agencies, or the U.S. Government. This research was partially supported by the OPTIMA ARC training centre IC200100009.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 69.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 89.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    The conditions used in community detection prevent the proper identification of peripheral groups.

  2. 2.

    In a single-view graph, there is at most one edge between any two vertices.

  3. 3.

    i.e., linear time after ignoring logarithmic factors

  4. 4.

    i.e., a pair of vertices not connected by an edge

  5. 5.

    unless \(|E| = O(|V|)\), in which case the complexity is near-linear in the size of the input because of the \(\log |V|\) factor

  6. 6.

    The complement graph \(\bar{G}\) has the same vertices as the original graph G but represents every edge in G as a non-edge and every non-edge in G as an edge.

  7. 7.

    An edge-induced subgraph of G has the same vertices as G but a subset of its edges.

  8. 8.

    M can be computed from A and C in \(O(|E| + k^2)\) time while evaluating the objective function in Eq. 2.

  9. 9.

    The domain of each \(c_i\) is \(\{1, 2 \ldots k\}\). Block \(\mathcal {B}_h\) refers to the collection of all vertices \(v_i \in V\) such that \(c_i = h\).

  10. 10.

    These values are only important as ballpark estimates. We observed that the performance of FMBM often stays stable within broad ranges of hyperparameter values, imparting robustness to FMBM. Moreover, only a few different hyperparameter settings had to be examined to determine the best one.

  11. 11.

    Although Graph-Tool does not require a user-specified value of k, it has a tendency to produce trivial solutions with \(k = 1\), resulting in 0 NMI values when the value of k is not explicitly specified.

  12. 12.

    https://github.com/leon-angli/Synthetic-Block-Modeling-Dataset

  13. 13.

    DANMF did not assign any block membership to a few vertices in some synthetic test cases. We assign Block \(\mathcal {B}_1\) by default to such vertices.

  14. 14.

    The spread here refers to how a block extends from its center to its periphery.

  15. 15.

    https://networkx.org/documentation/stable/reference/generated/networkx.drawing.nx_pylab.draw.html

References

  1. Abbe, E.: Community detection and stochastic block models: recent developments. J. Mach. Learn. Res. 18, 6446–6531 (2017)

    MathSciNet  Google Scholar 

  2. Antonopoulos, C.G.: Dynamic range in the C. elegans brain network. Chaos: Interdisc. J. Nonlinear Sci. 26(1), 013102 (2016)

    Google Scholar 

  3. Chan, J., Liu, W., Kan, A., Leckie, C., Bailey, J., Ramamohanarao, K.: Discovering latent blockmodels in sparse and noisy graphs using non-negative matrix factorisation. In: Proceedings of the ACM International Conference on Information & Knowledge Management (2013)

    Google Scholar 

  4. Cohen, L., Uras, T., Jahangiri, S., Arunasalam, A., Koenig, S., Kumar, T.K.S.: The FastMap algorithm for shortest path computations. In: Proceedings of the International Joint Conference on Artificial Intelligence (2018)

    Google Scholar 

  5. Davis, T.: USAir97 (2014). https://www.cise.ufl.edu/research/sparse/matrices/Pajek/USAir97

  6. Faloutsos, C., Lin, K.I.: FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. In: Proceedings of the ACM SIGMOD International Conference on Management of Data (1995)

    Google Scholar 

  7. Fredman, M.L., Tarjan, R.E.: Fibonacci heaps and their uses in improved network optimization algorithms. J. ACM (JACM) 34, 596–615 (1987)

    Article  MathSciNet  MATH  Google Scholar 

  8. Girvan, M., Newman, M.E.: Community structure in social and biological networks. Natl. Acad. Sci. 99, 7821–7826 (2002)

    Article  MathSciNet  MATH  Google Scholar 

  9. Gopalakrishnan, S., Cohen, L., Koenig, S., Kumar, T.K.S.: Embedding directed graphs in potential fields using FastMap-D. In: Proceedings of the International Symposium on Combinatorial Search (2020)

    Google Scholar 

  10. Hagberg, A., Swart, P., Chult, D.S.: Exploring network structure, dynamics, and function using NetworkX. Technical report, Los Alamos National Lab, Los Alamos, NM (United States) (2008)

    Google Scholar 

  11. Lee, J., Gross, S.P., Lee, J.: Improved network community structure improves function prediction. Sci. Rep. 3, 1–9 (2013)

    Article  Google Scholar 

  12. Li, J., Felner, A., Koenig, S., Kumar, T.K.S.: Using FastMap to solve graph problems in a Euclidean space. In: Proceedings of the International Conference on Automated Planning and Scheduling (2019)

    Google Scholar 

  13. Lin, S., Hu, Q., Wang, G., Yu, P.S.: Understanding community effects on information diffusion. In: Cao, T., Lim, E.-P., Zhou, Z.-H., Ho, T.-B., Cheung, D., Motoda, H. (eds.) PAKDD 2015. LNCS (LNAI), vol. 9077, pp. 82–95. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-18038-0_7

    Chapter  Google Scholar 

  14. Mattenet, A., Davidson, I., Nijssen, S., Schaus, P.: Generic constraint-based block modeling using constraint programming. J. Artif. Intell. Res. 70, 597–630 (2021)

    Article  MathSciNet  MATH  Google Scholar 

  15. Murphy, K.P.: Machine Learning: A probabilistic perspective. The MIT Press, Cambridge (2012)

    MATH  Google Scholar 

  16. Newman, M.E.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74(3), 036104 (2006)

    Google Scholar 

  17. Peixoto, T.P.: Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models. Phys. Rev. E 89(1), 012804 (2014)

    Google Scholar 

  18. Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social representations. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2014)

    Google Scholar 

  19. Ramteke, R., et al.: Improving single and multi-view blockmodelling by algebraic simplification. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN) (2020)

    Google Scholar 

  20. Ye, F., Chen, C., Zheng, Z.: Deep autoencoder-like nonnegative matrix factorization for community detection. In: Proceedings of the ACM International Conference on Information and Knowledge Management (2018)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ang Li .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Li, A., Stuckey, P., Koenig, S., Kumar, T.K.S. (2022). A FastMap-Based Algorithm for Block Modeling. In: Schaus, P. (eds) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2022. Lecture Notes in Computer Science, vol 13292. Springer, Cham. https://doi.org/10.1007/978-3-031-08011-1_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-08011-1_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08010-4

  • Online ISBN: 978-3-031-08011-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics