
Structure learning of sparse directed acyclic graphs incorporating the scale-free property

  • Original Paper
Computational Statistics

Abstract

Directed acyclic graphs have been widely used to model causal relationships among variables. Many existing methods rely on \(l_1\)-based penalties to induce sparsity. However, studies of networks show that many real networks are not only sparse but also scale-free, that is, their degree distribution follows a power law. To capture the scale-free property, we propose a novel penalized likelihood method that employs a log 1-norm group penalty, the composite of the well-known log-type and lasso-type penalty functions. We then design an efficient coordinate descent algorithm to solve the resulting nonconvex problem. Moreover, we establish the estimation consistency of the estimator in the setting where the error variances are fixed at an identical constant. Numerical studies are also conducted to demonstrate the merits of our method.
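
To make the idea of composing a log-type penalty with a lasso-type penalty concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes that each group collects the edge weights in one column of a coefficient matrix B, that the penalty has the form lam * sum_j [log(eps + ||B[:, j]||_1) - log(eps)], and that lam and eps are free tuning constants; the function name, the grouping, and the offset eps are all hypothetical choices made for illustration.

```python
import numpy as np


def log_group_l1_penalty(B, lam=1.0, eps=1e-3):
    """Illustrative composite penalty (an assumed form, not the paper's).

    Inner (lasso-type) part: the L1 norm of each column of B.
    Outer (log-type) part: log(eps + norm), shifted by log(eps) so that
    an all-zero column contributes exactly zero.
    """
    B = np.asarray(B, dtype=float)
    group_l1 = np.abs(B).sum(axis=0)  # one L1 norm per column (group)
    return lam * np.sum(np.log(group_l1 + eps) - np.log(eps))


# Two 5-node graphs with four unit-weight edges each.
hub = np.zeros((5, 5))
hub[1:5, 0] = 1.0  # all four edges share column 0 (one hub node)

spread = np.zeros((5, 5))
for i in range(4):
    spread[i, i + 1] = 1.0  # one edge in each of columns 1..4 (a chain)

hub_penalty = log_group_l1_penalty(hub)
spread_penalty = log_group_l1_penalty(spread)
print(hub_penalty < spread_penalty)
```

Because the logarithm is concave, the same number of edges costs less when they are concentrated on one node than when they are spread evenly across nodes. That is the intuition for why a penalty of this kind can favor hub-dominated, scale-free structure, whereas a plain \(l_1\) penalty would score both graphs identically.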

Acknowledgements

We are very grateful to the Editor, the Associate Editor and the two referees for their careful work and helpful comments which led to an improved version of this paper. Hai Zhang’s research is partially supported by the National Natural Science Foundation of China (Grant No. 11571011). Yao Wang’s research is partially supported by the National Natural Science Foundation of China (Grant Nos. 11501440, 61673015), the China Postdoctoral Science Foundation (Grant Nos. 2017M610628, 2018T111031) and the Key Research Program of Hunan Province, China (Grant No. 2017GK2273).

Corresponding author

Correspondence to Hai Zhang.

Cite this article

Guo, X., Zhang, H., Wang, Y. et al. Structure learning of sparse directed acyclic graphs incorporating the scale-free property. Comput Stat 34, 713–742 (2019). https://doi.org/10.1007/s00180-018-0841-8

  • DOI: https://doi.org/10.1007/s00180-018-0841-8
