Probably Approximately Correct Learning of Regulatory Networks from Time-Series Data

Carcano, Arthur; Fages, François; Soliman, Sylvain

doi:10.1007/978-3-319-67471-1_5

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10545)

Included in the following conference series: International Conference on Computational Methods in Systems Biology

Abstract

Automating the process of model building from experimental data is a highly desirable goal, as it would compensate for the lack of modellers in many applications. However, despite the spectacular progress of machine learning techniques in data analytics, classification, clustering and prediction, learning dynamical models from time-series data remains challenging. In this paper we investigate the use of Leslie Valiant’s Probably Approximately Correct (PAC) learning framework as a method for the automated discovery of influence models of biochemical processes from Boolean and stochastic traces. We show that Thomas’ Boolean influence systems can be naturally represented by k-CNF formulae and learned from time-series data with a number of Boolean activation samples per species that is quasi-linear in the precision of the learned model, and that positive Boolean influence systems can be represented by monotone DNF formulae and learned actively with both activation samples and oracle calls. We consider Boolean traces and Boolean abstractions of stochastic simulation traces, and study the space-time tradeoff between the diversity of initial states and the length of the time horizon, as well as its impact on the error bounds provided by the PAC learning algorithms. We evaluate the performance of this approach on a model of T-lymphocyte differentiation, with and without prior knowledge, and discuss its merits and limitations with respect to realistic experiments.
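
To make the k-CNF learning mentioned in the abstract more concrete, here is a minimal sketch of the classical elimination algorithm for k-CNF formulae from positive samples, as introduced in Valiant's PAC framework. It is an illustration only and not the authors' implementation: the function names (`all_clauses`, `satisfies`, `learn_kcnf`, `evaluate`) and the toy activation rule are assumptions made for this example. The learner starts from the conjunction of all clauses with at most k literals and deletes every clause falsified by an observed positive (activation) sample.

```python
from itertools import combinations, product

def all_clauses(n, k):
    """Return all non-tautological clauses with 1..k literals over n variables.

    A literal is a pair (variable index, polarity); a clause is a frozenset
    of literals and is read as their disjunction.
    """
    clauses = set()
    for size in range(1, k + 1):
        for idxs in combinations(range(n), size):
            for signs in product((True, False), repeat=size):
                clauses.add(frozenset(zip(idxs, signs)))
    return clauses

def satisfies(clause, v):
    """True if the Boolean state v makes at least one literal of the clause true."""
    return any(bool(v[i]) == polarity for i, polarity in clause)

def learn_kcnf(n, k, positive_samples):
    """Elimination learner: keep only the clauses satisfied by every positive sample."""
    hypothesis = all_clauses(n, k)
    for v in positive_samples:
        hypothesis = {c for c in hypothesis if satisfies(c, v)}
    return hypothesis

def evaluate(hypothesis, v):
    """Evaluate the learned conjunction of clauses on the state v."""
    return all(satisfies(c, v) for c in hypothesis)

# Hypothetical toy rule over three species: the target is activated
# when x0 is present and x1 is absent (x2 is irrelevant).
samples = [(1, 0, 0), (1, 0, 1)]
h = learn_kcnf(3, 2, samples)
for state in product((0, 1), repeat=3):
    print(state, evaluate(h, state))
```

In this sketch the hypothesis can only err by rejecting states that are in fact activating, since every clause it keeps is consistent with all observed positive samples. With more species, the number of candidate clauses grows as O((2n)^k), which is why a small bound k on the clause size matters in practice.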

Notes

1. For the sake of reproducibility, the code used in this article is available at http://lifeware.inria.fr/wiki/software/#CMSB17.
2. More generally, the PAC learning protocol can discover partial vectors, but for the applications discussed in the current article it is enough to only consider total vectors.
3. http://lifeware.inria.fr/biocham4.
4. More precisely, in a well-formed influence system, $f$ is assumed to be partially differentiable; $x_i \in P$ if and only if $\sigma = +$ (resp. $-$) and $\partial f / \partial x_i(\boldsymbol{x}) > 0$ (resp. $< 0$) for some value $\boldsymbol{x} \in \mathbb{R}_+^n$; and $x_i \in I$ if and only if $\sigma = +$ (resp. $-$) and $\partial f / \partial x_i(\boldsymbol{x}) < 0$ (resp. $> 0$) for some value $\boldsymbol{x} \in \mathbb{R}_+^n$.
5. Note that this function ignores the cases where $v_i = 0$ and $x_i^-(v) = 0$, or $v_i = 1$ and $x_i^+(v) = 1$, which may create loops in non-terminal states in general influence systems.

Acknowledgements

This work is partly supported by the ANR project Hyclock.

Author information

Authors and Affiliations

Ecole Normale Supérieure, Paris, France
Arthur Carcano
Inria, University Paris-Saclay, Lifeware Group, Palaiseau, France
François Fages & Sylvain Soliman

Corresponding author

Correspondence to François Fages.

Editor information

Editors and Affiliations

Inria & École normale supérieure, Paris, France
Jérôme Feret
Technische Universität Darmstadt, Darmstadt, Germany
Heinz Koeppl

About this paper

Cite this paper

Carcano, A., Fages, F., Soliman, S. (2017). Probably Approximately Correct Learning of Regulatory Networks from Time-Series Data. In: Feret, J., Koeppl, H. (eds) Computational Methods in Systems Biology. CMSB 2017. Lecture Notes in Computer Science, vol 10545. Springer, Cham. https://doi.org/10.1007/978-3-319-67471-1_5

DOI: https://doi.org/10.1007/978-3-319-67471-1_5
Published: 01 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67470-4
Online ISBN: 978-3-319-67471-1
eBook Packages: Computer Science, Computer Science (R0)
