Abstract
Directed acyclic graphs have been widely used to model causal relationships among variables. Many existing works focus on \(l_1\)-based methods to induce sparsity. However, studies on networks show that many real networks are not only sparse but also scale-free, that is, the degree distribution of the network follows a power law. To capture the scale-free property, we propose a novel penalized likelihood method that employs a log 1-norm group penalty, the composite of the well-known log-type and lasso-type penalty functions. We then design an efficient coordinate descent algorithm to solve the resulting nonconvex problem. Moreover, we establish the estimation consistency of the estimator in the setting where the error variances are fixed at an identical constant. Numerical studies are also conducted to demonstrate the merits of our method.
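To illustrate why composing a log-type penalty with a lasso-type penalty can favour hub structure, the following minimal sketch compares two coefficient matrices with the same number of edges. The abstract does not give the exact penalty, so the form used here, \(\lambda \sum_j \log(\Vert \beta_{\cdot j} \Vert_1 + \epsilon)\), the grouping of coefficients by column, and the names `log_l1_group_penalty`, `l1_penalty`, `lam` and `eps` are all assumptions made for illustration, not the authors' definition.

```python
import math


def l1_penalty(B, lam=1.0):
    """Plain lasso-type penalty: lam times the sum of absolute entries."""
    return lam * sum(abs(x) for row in B for x in row)


def log_l1_group_penalty(B, lam=1.0, eps=1.0):
    """Assumed composite penalty: lam * sum_j log(||B[:, j]||_1 + eps).

    B is a list of rows; column j is treated as the coefficient group of
    node j. Both the grouping and the offset eps are illustrative choices.
    """
    n_cols = len(B[0])
    total = 0.0
    for j in range(n_cols):
        col_l1 = sum(abs(row[j]) for row in B)  # inner lasso-type 1-norm
        total += math.log(col_l1 + eps)         # outer log-type penalty
    return lam * total


p = 5

# Hub pattern: four edges, all attached to node 0.
hub = [[0.0] * p for _ in range(p)]
for i in range(1, p):
    hub[i][0] = 1.0

# Spread pattern: four edges, each attached to a different node.
spread = [[0.0] * p for _ in range(p)]
for i in range(1, p):
    spread[i][i - 1] = 1.0

print("l1 penalty:     hub =", l1_penalty(hub), " spread =", l1_penalty(spread))
print("log-l1 penalty: hub =", log_l1_group_penalty(hub),
      " spread =", log_l1_group_penalty(spread))
```

Under these assumptions the plain \(l_1\) penalty cannot distinguish the two patterns, while the composite penalty is smaller for the hub pattern because the logarithm grows slowly once a group is already large. This is the intuition behind using such a penalty to encourage a heavy-tailed degree distribution.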
Acknowledgements
We are very grateful to the Editor, the Associate Editor and the two referees for their careful work and helpful comments which led to an improved version of this paper. Hai Zhang’s research is partially supported by the National Natural Science Foundation of China (Grant No. 11571011). Yao Wang’s research is partially supported by the National Natural Science Foundation of China (Grant Nos. 11501440, 61673015), the China Postdoctoral Science Foundation (Grant Nos. 2017M610628, 2018T111031) and the Key Research Program of Hunan Province, China (Grant No. 2017GK2273).
Cite this article
Guo, X., Zhang, H., Wang, Y. et al. Structure learning of sparse directed acyclic graphs incorporating the scale-free property. Comput Stat 34, 713–742 (2019). https://doi.org/10.1007/s00180-018-0841-8
DOI: https://doi.org/10.1007/s00180-018-0841-8