Abstract
This paper presents a new sequential testing procedure that yields a consistent estimator of the number of communities in a stochastic block model. The procedure is based on the locally smoothed adjacency matrix and extreme value theory. Under the null hypothesis, the test statistic converges in distribution to the type I extreme value (Gumbel) distribution; otherwise it diverges quickly, at a rate that can reach n in the strong-signal case, where n is the size of the network, which guarantees high detection power. The method is simple to use and serves as an alternative to the random-matrix-theory approach of Lei (2016). To detect changes in the community structure, the authors also propose a two-sample test for the stochastic block model with two observed adjacency matrices. Simulation studies support the theory. Applied to the political blog data set, the proposed method finds reasonable group structures.
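The sequential logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the callable `statistic` is a hypothetical placeholder for a test statistic that is assumed to be already centred and scaled so that its null limit is the standard Gumbel law, and `k_max` and the fallback behaviour are illustrative choices. Only the Gumbel critical value and the "stop at the first non-rejection" loop are shown.

```python
import math
from typing import Callable


def gumbel_critical_value(alpha: float) -> float:
    """Upper-alpha quantile of the standard Gumbel law, F(x) = exp(-exp(-x))."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    return -math.log(-math.log(1.0 - alpha))


def estimate_num_communities(
    statistic: Callable[[int], float],  # hypothetical: k -> normalised test statistic
    alpha: float = 0.05,
    k_max: int = 20,                    # illustrative upper bound on candidates
) -> int:
    """Test H0: K = k for k = 1, 2, ... and return the first k not rejected."""
    threshold = gumbel_critical_value(alpha)
    for k in range(1, k_max + 1):
        if statistic(k) <= threshold:
            return k
    # Every candidate was rejected; fall back to the largest one tried.
    return k_max


def toy_statistic(k: int) -> float:
    """Toy stand-in (an assumption): large below the true K = 3, small from 3 on."""
    return 50.0 if k < 3 else 1.0


estimated_k = estimate_num_communities(toy_statistic)
```

With the toy statistic, the loop rejects k = 1 and k = 2 and stops at k = 3. In practice the statistic would be computed from the observed adjacency matrix for each candidate k.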
References
Newman M E J and Girvan M, Finding and evaluating community structure in networks, Physical Review E, 2004, 69(2): 026113.
Newman M E J, Modularity and community structure in networks, Proceedings of the National Academy of Sciences, 2006, 103(23): 8577–8582.
Bickel P J and Chen A, A nonparametric view of network models and Newman-Girvan and other modularities, Proceedings of the National Academy of Sciences, 2009, 106(50): 21068–21073.
Wolfe P J and Olhede S C, Nonparametric graphon estimation, arXiv preprint arXiv:1309.5936, 2013.
Holland P W, Laskey K B, and Leinhardt S, Stochastic blockmodels: First steps, Social Networks, 1983, 5(2): 109–137.
Karrer B and Newman M E J, Stochastic blockmodels and community structure in networks, Physical Review E, 2011, 83(1): 016107.
Airoldi E M, Blei D M, Fienberg S E, et al., Mixed membership stochastic blockmodels, Journal of Machine Learning Research, 2008, 9(Sep): 1981–2014.
Brock W A and Durlauf S N, Identification of binary choice models with social interactions, Journal of Econometrics, 2007, 140(1): 52–75.
Andrikopoulos A, Samitas A, and Kostaris K, Four decades of the Journal of Econometrics: Coauthorship patterns and networks, Journal of Econometrics, 2016, 195(1): 23–32.
Liu X and Lee L F, GMM estimation of social interaction models with centrality, Journal of Econometrics, 2010, 159(1): 99–115.
Graham B S, An econometric model of network formation with degree heterogeneity, Econometrica, 2017, 85(4): 1033–1063.
Zhang Y, Levina E, and Zhu J, Estimating network edge probabilities by neighbourhood smoothing, Biometrika, 2017, 104(4): 771–783.
Gao C, van der Vaart A W, and Zhou H H, A general framework for Bayes structured linear models, The Annals of Statistics, 2020, 48(5): 2848–2878.
Lei J, A goodness-of-fit test for stochastic block models, The Annals of Statistics, 2016, 44(1): 401–424.
van der Pas S L and van der Vaart A, Bayesian community detection, Bayesian Analysis, 2018, 13(3): 767–796.
Wang Y R and Bickel P J, Likelihood-based model selection for stochastic block models, The Annals of Statistics, 2017, 45(2): 500–528.
Hu J, Qin H, Yan T, et al., Corrected Bayesian information criterion for stochastic block models, Journal of the American Statistical Association, 2020, 115(532): 1771–1783.
Karwa V, Pati D, Petrović S, et al., Exact tests for stochastic block models, arXiv preprint arXiv:1612.06040, 2016.
Hu J, Zhang J, Qin H, et al., Using maximum entry-wise deviation to test the goodness of fit for stochastic block models, Journal of the American Statistical Association, 2020, 1–10.
Barnett I and Onnela J P, Change point detection in correlation networks, Scientific Reports, 2016, 6: 18893.
Zhou W, Asymptotic distribution of the largest off-diagonal entry of correlation matrices, Transactions of the American Mathematical Society, 2007, 359(11): 5345–5363.
Lei J and Rinaldo A, Consistency of spectral clustering in stochastic block models, The Annals of Statistics, 2015, 43(1): 215–237.
Gao C, Ma Z, Zhang A Y, et al., Achieving optimal misclassification proportion in stochastic block models, Journal of Machine Learning Research, 2017, 18(1): 1980–2024.
Amini A A, Chen A, Bickel P J, et al., Pseudo-likelihood methods for community detection in large sparse networks, The Annals of Statistics, 2013, 41(4): 2097–2122.
Petrov V V, Sums of Independent Random Variables, Springer-Verlag, Berlin, 1975.
Additional information
Wu’s work was supported by the National Natural Science Foundation of China under Grant No. 71971118; Kong’s work was supported by the National Natural Science Foundation of China under Grant No. 71971118; Xu’s work was supported by Major Natural Science Projects of Universities in Jiangsu Province under Grant No. 20KJA520002.
About this article
Cite this article
Wu, F., Kong, X. & Xu, C. Test on Stochastic Block Model: Local Smoothing and Extreme Value Theory. J Syst Sci Complex 35, 1535–1556 (2022). https://doi.org/10.1007/s11424-021-0154-9