Abstract
Understanding the regulatory mechanisms responsible for an organism’s response to environmental changes is an important problem in molecular biology. A first step towards this goal is to detect genes whose expression levels are affected by altered external conditions. A range of methods to test for differential gene expression, in both static and time-course experiments, has been proposed. While these tests answer the question of whether a gene is differentially expressed, they do not explicitly address when it is differentially expressed, although this information may provide insights into the course and causal structure of regulatory programs. In this article, we propose a two-sample test for identifying intervals of differential gene expression in microarray time series. Our approach is based on Gaussian process regression, can deal with arbitrary numbers of replicates, and is robust to outliers. We apply our algorithm to study the response of Arabidopsis thaliana genes to infection by a fungal pathogen, using a microarray time series dataset covering 30,336 gene probes at 24 time points. In classification experiments, our test compares favorably with existing methods and provides additional insights into time-dependent differential expression.
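To make the idea of a Gaussian-process two-sample test concrete, the following is a minimal sketch and not the method of the paper. It assumes that the comparison can be phrased as a log Bayes factor between a model in which both conditions share one latent expression curve and a model in which each condition has its own curve. The squared-exponential kernel, the fixed hyperparameters, the function names, and the synthetic data are all illustrative assumptions; the sketch uses plain Gaussian noise, so it includes neither the outlier robustness nor the interval detection described in the abstract.

```python
import numpy as np


def rbf_kernel(t1, t2, amplitude=1.0, length_scale=2.0):
    """Squared-exponential covariance between two vectors of time points."""
    diff = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (diff / length_scale) ** 2)


def log_marginal_likelihood(t, y, noise_std=0.3, **kernel_args):
    """Log evidence of a zero-mean GP with Gaussian noise.

    Hyperparameters are held fixed here (an assumption made for brevity);
    in practice they would be optimised or integrated out.
    """
    K = rbf_kernel(t, t, **kernel_args) + noise_std**2 * np.eye(len(t))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * len(t) * np.log(2.0 * np.pi))


def log_bayes_factor(t_a, y_a, t_b, y_b, **args):
    """Log evidence of 'two independent curves' minus 'one shared curve'.

    Larger values favour differential expression between the conditions.
    """
    independent = (log_marginal_likelihood(t_a, y_a, **args)
                   + log_marginal_likelihood(t_b, y_b, **args))
    shared = log_marginal_likelihood(np.concatenate([t_a, t_b]),
                                     np.concatenate([y_a, y_b]), **args)
    return independent - shared


# Synthetic example (hypothetical data, not the Arabidopsis dataset).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 23.0, 24)
base = np.sin(t / 4.0)
control = base + 0.3 * rng.standard_normal(t.size)
treated_same = base + 0.3 * rng.standard_normal(t.size)
# The second half of this series is shifted to mimic a late response.
treated_diff = base + 1.5 * (t > 12) + 0.3 * rng.standard_normal(t.size)

score_same = log_bayes_factor(t, control, t, treated_same)
score_diff = log_bayes_factor(t, control, t, treated_diff)
print("no change:  ", score_same)
print("late change:", score_diff)
```

A score computed this way summarises the whole time course in one number, so it only answers whether the two conditions differ. Localising when they differ, which is the focus of the abstract, would require a time-resolved extension of this comparison.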
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stegle, O., Denby, K., Wild, D.L., Ghahramani, Z., Borgwardt, K.M. (2009). A Robust Bayesian Two-Sample Test for Detecting Intervals of Differential Gene Expression in Microarray Time Series. In: Batzoglou, S. (ed.) Research in Computational Molecular Biology. RECOMB 2009. Lecture Notes in Computer Science, vol 5541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02008-7_14
DOI: https://doi.org/10.1007/978-3-642-02008-7_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02007-0
Online ISBN: 978-3-642-02008-7
eBook Packages: Computer Science, Computer Science (R0)