Clustering Time-Series Gene Expression Data with Unequal Time Intervals

Transactions on Computational Systems Biology X

Part of the book series: Lecture Notes in Computer Science (TCSB, volume 5410)

Abstract

Clustering gene expression data given as time series is a challenging problem with its own particular constraints: two or more time points cannot be exchanged, because doing so would produce quite different results and lead to erroneous biological conclusions. We focus on issues related to clustering gene expression temporal profiles and devise a novel algorithm for clustering temporal expression profiles from microarray data. The proposed method introduces the concept of profile alignment, which is achieved by minimizing the area between two aligned profiles. The overall pattern of expression in the time series is captured by combining agglomerative clustering with profile alignment, and the optimal number of clusters for a given dataset is determined by a variant of a clustering index. The effectiveness of the approach is demonstrated on two well-known datasets, yeast and serum, and corroborated with a set of pre-clustered yeast genes, on which the method achieves very high classification accuracy even though it is an unsupervised scheme.
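
The idea of aligning two profiles before comparing them can be illustrated with a minimal sketch. The abstract does not state the distance formula, so everything below is an assumption made for illustration: profiles are treated as piecewise linear between sampling times, alignment is a vertical offset, the "area" is the integrated squared difference, and clusters are merged by average linkage down to a given number k. The function names (`best_shift`, `aligned_distance`, `agglomerative`) are hypothetical, and the clustering index used to choose k is not reproduced here.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumptions: piecewise-linear profiles, vertical-offset alignment,
# squared-area distance, average linkage, k supplied by the caller.

def sq_area(d, t):
    """Exact integral of the square of the piecewise-linear curve through (t, d)."""
    total = 0.0
    for i in range(len(t) - 1):
        dt = t[i + 1] - t[i]  # unequal intervals enter through dt
        total += dt * (d[i] ** 2 + d[i] * d[i + 1] + d[i + 1] ** 2) / 3.0
    return total

def best_shift(x, y, t):
    """Offset c that minimises the integral of (x(t) - (y(t) + c))^2."""
    d = [a - b for a, b in zip(x, y)]
    # The minimiser is the time-weighted mean of x - y (trapezoid rule is
    # exact for piecewise-linear curves).
    area = sum((t[i + 1] - t[i]) * (d[i] + d[i + 1]) / 2.0
               for i in range(len(t) - 1))
    return area / (t[-1] - t[0])

def aligned_distance(x, y, t):
    """Squared area left between x and y after the best vertical shift."""
    c = best_shift(x, y, t)
    return sq_area([a - b - c for a, b in zip(x, y)], t)

def agglomerative(profiles, t, k):
    """Merge the closest pair of clusters (average linkage) until k remain."""
    clusters = [[i] for i in range(len(profiles))]

    def linkage(a, b):
        pairs = [(i, j) for i in a for j in b]
        return sum(aligned_distance(profiles[i], profiles[j], t)
                   for i, j in pairs) / len(pairs)

    while len(clusters) > k:
        _, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Made-up toy data with unequal sampling intervals.
t = [0.0, 1.0, 3.0, 7.0]
rise = [0.0, 1.0, 2.0, 1.0]
rise_offset = [5.0, 6.0, 7.0, 6.0]   # same shape as rise, shifted up
dip = [1.0, 0.0, -1.0, 0.0]
dip_offset = [3.0, 2.0, 1.0, 2.0]    # same shape as dip, shifted up
groups = agglomerative([rise, rise_offset, dip, dip_offset], t, 2)
```

In this toy example the two profiles that differ only by a constant offset have an aligned distance of zero, so they merge first; the interval lengths weight each segment's contribution, which is where unequal sampling matters.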

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Rueda, L., Bari, A., Ngom, A. (2008). Clustering Time-Series Gene Expression Data with Unequal Time Intervals. In: Priami, C., Dressler, F., Akan, O.B., Ngom, A. (eds) Transactions on Computational Systems Biology X. Lecture Notes in Computer Science, vol 5410. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92273-5_6

  • DOI: https://doi.org/10.1007/978-3-540-92273-5_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-92272-8

  • Online ISBN: 978-3-540-92273-5

  • eBook Packages: Computer Science (R0)
