A FastMap-Based Algorithm for Block Modeling

Li, Ang; Stuckey, Peter; Koenig, Sven; Kumar, T. K. Satish

doi:10.1007/978-3-031-08011-1_16


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13292)

Included in the following conference series:

International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research


Abstract

Block modeling algorithms are used to discover important latent structures in graphs. They are the graph equivalent of clustering algorithms. However, existing block modeling algorithms work directly on the given graphs, making them computationally expensive and less effective on large complex graphs. In this paper, we propose a FastMap-based algorithm for block modeling on single-view undirected graphs. FastMap embeds a given undirected graph into a Euclidean space in near-linear time such that the pairwise Euclidean distances between vertices approximate a desired graph-based distance function between them. In the first phase, our FastMap-based block modeling (FMBM) algorithm uses FastMap with a probabilistically-amplified shortest-path distance function between vertices. In the second phase, it uses Gaussian Mixture Models (GMMs) for identifying clusters (blocks) in the resulting Euclidean space. FMBM outperforms other state-of-the-art methods on many benchmark and synthetic instances, both in efficiency and solution quality. It also enables a perspicuous visualization of clusters (blocks) in the graphs, not provided by other methods.
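The first phase described above can be sketched roughly as follows. This is a minimal illustration in the spirit of the original FastMap of Faloutsos and Lin, not the authors' implementation: it assumes a connected, unweighted, undirected graph given as an adjacency dictionary, and it uses plain hop-count shortest-path distances in place of the probabilistically-amplified distance function, whose details are not given in this excerpt. The names `bfs_distances`, `fastmap_embed`, and `pivot_rounds` are hypothetical.

```python
from collections import deque
from math import sqrt
import random


def bfs_distances(adj, source):
    """Hop distances from `source` to every vertex of a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def fastmap_embed(adj, dims, seed=0, pivot_rounds=3):
    """Embed vertices into `dims` Euclidean coordinates (illustrative sketch)."""
    rng = random.Random(seed)
    vertices = sorted(adj)
    coords = {v: [] for v in vertices}

    def residual_sq(src):
        # Squared graph distance minus the part already explained by the
        # coordinates computed so far, clamped at zero.
        d = bfs_distances(adj, src)
        return {
            v: max(0.0, d[v] ** 2
                   - sum((p - q) ** 2 for p, q in zip(coords[src], coords[v])))
            for v in vertices
        }

    for _ in range(dims):
        # Heuristic search for a far-apart pivot pair (a, b).
        a = b = rng.choice(vertices)
        for _ in range(pivot_rounds):
            da = residual_sq(a)
            b = max(vertices, key=da.__getitem__)
            a, b = b, a
        da, db = residual_sq(a), residual_sq(b)
        dab = da[b]
        if dab <= 1e-12:
            # No residual distance left: pad this coordinate with zeros.
            for v in vertices:
                coords[v].append(0.0)
            continue
        # Cosine-law projection of every vertex onto the line through a and b.
        new = {v: (da[v] + dab - db[v]) / (2.0 * sqrt(dab)) for v in vertices}
        for v in vertices:
            coords[v].append(new[v])
    return coords


# Usage: a path graph 0-1-2-3-4 embeds onto a line.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
points = fastmap_embed(path, dims=2)
```

For the second phase, the resulting coordinate vectors could be passed to any Gaussian Mixture Model implementation (for example, scikit-learn's `GaussianMixture`) with the desired number of blocks; that step is omitted here to keep the sketch dependency-free.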

This work at the University of Southern California is supported by DARPA under grant number HR001120C0157 and by NSF under grant numbers 1409987, 1724392, 1817189, 1837779, 1935712, and 2112533. The views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the sponsoring organizations, agencies, or the U.S. Government. This research was partially supported by the OPTIMA ARC training centre IC200100009.


Notes

1. The conditions used in community detection prevent the proper identification of peripheral groups.
2. In a single-view graph, there is at most one edge between any two vertices.
3. i.e., linear time after ignoring logarithmic factors.
4. i.e., a pair of vertices not connected by an edge.
5. Unless \(|E| = O(|V|)\), in which case the complexity is near-linear in the size of the input because of the \(\log |V|\) factor.
6. The complement graph \(\bar{G}\) has the same vertices as the original graph \(G\) but represents every edge in \(G\) as a non-edge and every non-edge in \(G\) as an edge.
7. An edge-induced subgraph of \(G\) has the same vertices as \(G\) but a subset of its edges.
8. \(M\) can be computed from \(A\) and \(C\) in \(O(|E| + k^2)\) time while evaluating the objective function in Eq. 2.
9. The domain of each \(c_i\) is \(\{1, 2, \ldots, k\}\). Block \(\mathcal{B}_h\) refers to the collection of all vertices \(v_i \in V\) such that \(c_i = h\).
10. These values are only important as ballpark estimates. We observed that the performance of FMBM often stays stable within broad ranges of hyperparameter values, imparting robustness to FMBM. Moreover, only a few different hyperparameter settings had to be examined to determine the best one.
11. Although Graph-Tool does not require a user-specified value of \(k\), it has a tendency to produce trivial solutions with \(k = 1\), resulting in 0 NMI values when the value of \(k\) is not explicitly specified.
12. https://github.com/leon-angli/Synthetic-Block-Modeling-Dataset
13. DANMF did not assign any block membership to a few vertices in some synthetic test cases. We assign Block \(\mathcal{B}_1\) by default to such vertices.
14. The spread here refers to how a block extends from its center to its periphery.
15. https://networkx.org/documentation/stable/reference/generated/networkx.drawing.nx_pylab.draw.html
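
Several of the definitions in the notes (the complement graph, the blocks induced by the labels \(c_i\), and a computation costing \(O(|E| + k^2)\)) can be illustrated with a small sketch. The function names below are hypothetical. In particular, `block_edge_counts` is only an assumed stand-in for the matrix \(M\), whose exact definition (together with Eq. 2) is not part of this excerpt; it simply shows how a \(k \times k\) table can be filled in one pass over the edges.

```python
from itertools import combinations


def complement_edges(vertices, edges):
    """Edges of the complement graph: every non-edge of G becomes an edge."""
    present = {frozenset(e) for e in edges}
    return [
        (u, v)
        for u, v in combinations(sorted(vertices), 2)
        if frozenset((u, v)) not in present
    ]


def blocks_from_labels(labels, k):
    """Block h collects every vertex v_i with c_i = h (labels are 1-based)."""
    blocks = {h: [] for h in range(1, k + 1)}
    for v, h in labels.items():
        blocks[h].append(v)
    return blocks


def block_edge_counts(edges, labels, k):
    """k x k table of edge counts between blocks (assumed stand-in for M).

    Allocating the table costs O(k^2); filling it costs O(|E|).
    """
    counts = [[0] * k for _ in range(k)]
    for u, v in edges:
        g, h = labels[u] - 1, labels[v] - 1
        counts[g][h] += 1
        if g != h:
            counts[h][g] += 1  # keep the table symmetric for undirected edges
    return counts


# Usage on a four-vertex path 1-2-3-4 split into two blocks.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4)]
labels = {1: 1, 2: 1, 3: 2, 4: 2}

non_edges = complement_edges(vertices, edges)
blocks = blocks_from_labels(labels, k=2)
counts = block_edge_counts(edges, labels, k=2)
```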


Author information

Authors and Affiliations

University of Southern California, Los Angeles, CA, 90007, USA
Ang Li, Sven Koenig & T. K. Satish Kumar
Monash University, Wellington Road, Clayton, VIC, 3800, Australia
Peter Stuckey


Corresponding author

Correspondence to Ang Li.

Editor information

Editors and Affiliations

UCLouvain, Louvain-la-Neuve, Belgium
Pierre Schaus


About this paper

Cite this paper

Li, A., Stuckey, P., Koenig, S., Kumar, T.K.S. (2022). A FastMap-Based Algorithm for Block Modeling. In: Schaus, P. (eds) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2022. Lecture Notes in Computer Science, vol 13292. Springer, Cham. https://doi.org/10.1007/978-3-031-08011-1_16


DOI: https://doi.org/10.1007/978-3-031-08011-1_16
Published: 10 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08010-4
Online ISBN: 978-3-031-08011-1
eBook Packages: Computer Science, Computer Science (R0)
