Gaussian Graphical Models to Infer Putative Genes Involved in Nitrogen Catabolite Repression in S. cerevisiae

  • Conference paper
Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBIO 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5483)

Abstract

Nitrogen is an essential nutrient for all life forms. Like most unicellular organisms, the yeast Saccharomyces cerevisiae transports and catabolizes good nitrogen sources in preference to poor ones. This selection mechanism is known as nitrogen catabolite repression (NCR). We propose an approach based on Gaussian graphical models (GGMs), which make it possible to distinguish direct from indirect interactions between genes, to identify putative NCR genes from putative NCR regulatory motifs and over-represented motifs in the upstream noncoding sequences of annotated NCR genes. Because the data are high-dimensional, we use a shrinkage estimator of the covariance matrix to infer the GGMs. We show that our approach makes significant and biologically valid predictions. We also show that GGMs are more effective than models that rely on measures of direct interactions between genes.
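
The idea of inferring a GGM from a shrunken covariance matrix can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the diagonal shrinkage target, and the fixed shrinkage intensity `lam` are assumptions made for illustration (a data-driven estimator would presumably choose the intensity from the data), and the three-variable chain is synthetic. The sketch shrinks the sample covariance toward its diagonal, inverts it, and rescales the inverse into partial correlations, which are the edge weights of a GGM.

```python
import numpy as np

def shrinkage_covariance(X, lam=0.2):
    """Blend the sample covariance S with its diagonal target T.

    lam is a hand-picked shrinkage intensity in [0, 1] (an assumption here).
    For lam > 0 the result is invertible even when variables outnumber samples,
    provided every variable has nonzero sample variance.
    """
    S = np.cov(X, rowvar=False)
    T = np.diag(np.diag(S))
    return (1.0 - lam) * S + lam * T

def partial_correlations(cov):
    """Partial correlations from the precision matrix omega = cov^{-1}:
    rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)."""
    omega = np.linalg.inv(cov)
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Synthetic chain a -> b -> c: a and c interact only indirectly, through b.
rng = np.random.default_rng(0)
n = 1000
a = rng.normal(size=n)
b = a + 0.5 * rng.normal(size=n)
c = b + 0.5 * rng.normal(size=n)
X = np.column_stack([a, b, c])

marginal = np.corrcoef(X, rowvar=False)
partial = partial_correlations(shrinkage_covariance(X, lam=0.01))

print("marginal corr(a, c):", round(marginal[0, 2], 3))
print("partial  corr(a, c | b):", round(partial[0, 2], 3))
```

In this toy chain the marginal correlation between `a` and `c` is large, while their partial correlation given `b` is close to zero. This is the sense in which a GGM separates direct from indirect interactions, whereas a measure based only on pairwise correlation would link `a` and `c`.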

This work was supported by the Communauté Française de Belgique (ARC grant no. 04/09-307).



Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kontos, K., André, B., van Helden, J., Bontempi, G. (2009). Gaussian Graphical Models to Infer Putative Genes Involved in Nitrogen Catabolite Repression in S. cerevisiae. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2009. Lecture Notes in Computer Science, vol 5483. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01184-9_2

  • DOI: https://doi.org/10.1007/978-3-642-01184-9_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01183-2

  • Online ISBN: 978-3-642-01184-9

  • eBook Packages: Computer Science, Computer Science (R0)
