Identification and prioritization of differentially expressed genes for time-series gene expression data

  • Research Article
  • Frontiers of Computer Science

Abstract

Identifying differentially expressed genes (DEGs) in time-course studies helps in understanding gene function and in determining key genes at specific stages of plant development. A few existing methods focus on detecting DEGs within a single biological group, which enables the study of temporal changes in gene expression. To make use of the rapidly increasing amount of single-group time-series expression data, we propose a two-step method that integrates the temporal characteristics of time-series data to obtain a B-spline curve fit. First, a filter based on the Ljung–Box test removes flat genes. Then, a B-spline model is used to identify DEGs. Before use in biological experiments, these DEGs should be screened to determine their biological importance. To identify high-confidence, promising DEGs for specific biological processes, we propose a novel gene prioritization approach based on the partner evaluation principle. This approach uses existing co-expression information to rank DEGs that are likely to be involved in a specific biological process or condition. The proposed method is validated on the Arabidopsis thaliana seed germination dataset and on the rice anther development expression dataset.
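The first step, the flat gene filter, can be illustrated with a minimal sketch of the standard Ljung–Box test in plain Python. The abstract does not state the number of lags, the significance level, or the exact decision rule, so `max_lag=3`, `alpha=0.05`, and the rule "a gene is flat when the test fails to reject the white-noise null" are assumptions made here for illustration, and the function names (`autocorr`, `ljung_box_q`, `chi2_sf`, `is_flat`) are hypothetical rather than taken from the article.

```python
import math


def autocorr(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    if denom == 0.0:
        return 0.0  # constant profile: treat as having no autocorrelation
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k r_k^2 / (n - k), k = 1..max_lag."""
    n = len(x)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(x)")
    return n * (n + 2) * sum(
        autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


def chi2_sf(q, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if q <= 0:
        return 1.0
    h = q / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{i=0}^{df/2 - 1} h^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return math.exp(-h) * total
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{i=1}^{(df-1)/2} h^(i - 1/2) / Gamma(i + 1/2)
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * term
        term *= h / (i + 0.5)
    return total


def is_flat(profile, max_lag=3, alpha=0.05):
    """Assumed rule: flat if the white-noise null is NOT rejected at level alpha."""
    p_value = chi2_sf(ljung_box_q(profile, max_lag), max_lag)
    return p_value >= alpha


# Usage: keep only genes whose expression profile shows temporal structure.
profiles = {
    "gene_trend": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "gene_const": [5.0] * 10,
}
candidates = [name for name, p in profiles.items() if not is_flat(p)]
```

Under these assumptions, only the profiles that pass the filter would go on to the second step, where a B-spline model is fitted to identify DEGs. In practice, a library routine such as the Ljung–Box implementation in statsmodels would usually replace the hand-written statistic.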

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61271346, 61571163, 61532014, 91335112, 61671189 and 61402132).

Author information

Corresponding author

Correspondence to Maozu Guo.

Additional information

Linlin Xing received his MS degree in computer science from Harbin Institute of Technology (HIT), China in 2012. He is currently a PhD candidate under the supervision of Professor Maozu Guo in the School of Computer Science and Technology, HIT. His research interests include gene expression data analysis and biological network construction.

Maozu Guo received his BS and MS degrees from Harbin Engineering University, China in 1988 and 1991, respectively, and his PhD degree from Harbin Institute of Technology (HIT), China in 1998, all in computer science. He is currently a professor in the School of Computer Science and Technology, HIT. His research interests include bioinformatics and machine learning.

Xiaoyan Liu received her BS and MS degrees in computer science from Harbin Engineering University, China, and her PhD degree in engineering mechanics from Harbin Institute of Technology (HIT), China. She is currently an associate professor in the School of Computer Science and Technology at HIT. Her research interests include bioinformatics and knowledge-based systems.

Chunyu Wang received his BS, MS and PhD degrees in computer science from Harbin Institute of Technology (HIT), China. He is now an associate professor in computer science and technology at HIT. His research interests include bioinformatics and machine learning.

Cite this article

Xing, L., Guo, M., Liu, X. et al. Identification and prioritization of differentially expressed genes for time-series gene expression data. Front. Comput. Sci. 12, 813–823 (2018). https://doi.org/10.1007/s11704-016-6287-7

  • DOI: https://doi.org/10.1007/s11704-016-6287-7
