Quantitative Evaluation of Established Clustering Methods for Gene Expression Data

  • Conference paper
Biological and Medical Data Analysis (ISBMDA 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3337)

Abstract

Analysis of gene expression data generated by microarray techniques often includes clustering. Although more reliable methods are available, hierarchical algorithms are still frequently employed. We clustered several data sets and quantitatively compared the performance of an agglomerative hierarchical approach using the average-linkage method with two partitioning procedures, k-means and fuzzy c-means. Investigation of the results revealed the superiority of the partitioning algorithms: the compactness of the clusters was markedly increased and the arrangement of the profiles into clusters more closely resembled biological categories. Therefore, we encourage analysts to critically scrutinize the results obtained by clustering.
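The kind of comparison the abstract describes can be illustrated with a small, self-contained sketch. It is not the authors' code, data, or validity index: the `profiles` list is made-up toy data, compactness is assumed here to mean the within-cluster sum of squares (WCSS), and k-means is seeded from the centroids of the average-linkage partition purely so the run is deterministic. Fuzzy c-means is omitted for brevity.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def centroid(members):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(members) for axis in zip(*members))


def wcss(points, labels):
    """Within-cluster sum of squared distances to each cluster centroid.

    Lower values mean more compact clusters (an assumed compactness measure).
    """
    total = 0.0
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        c = centroid(members)
        total += sum(dist(p, c) ** 2 for p in members)
    return total


def average_linkage(points, k):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest mean pairwise distance until k clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def avg_dist(pair):
        a, b = clusters[pair[0]], clusters[pair[1]]
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2), key=avg_dist)
        clusters[a] += clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]

    labels = [0] * len(points)
    for lab, members in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels


def kmeans(points, seeds, max_iter=100):
    """Lloyd's algorithm starting from the given seed centroids."""
    centroids = list(seeds)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = centroid(members)
    return labels


# Hypothetical toy "expression profiles" (two conditions per gene).
profiles = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
    (1.0, 1.1), (1.2, 0.9),
    (2.0, 2.1), (2.2, 1.9), (2.1, 2.3),
    (5.0, 5.0),
]
k = 3

hc_labels = average_linkage(profiles, k)
seeds = [centroid([p for p, l in zip(profiles, hc_labels) if l == c]) for c in range(k)]
km_labels = kmeans(profiles, seeds)

print("average-linkage WCSS:", round(wcss(profiles, hc_labels), 3))
print("k-means WCSS:        ", round(wcss(profiles, km_labels), 3))
```

Because each Lloyd iteration cannot increase the WCSS, k-means seeded this way is never less compact than the hierarchical partition it starts from. With random seeding, as is more common in practice, that guarantee no longer holds and several restarts would be needed for a fair comparison.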

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Radke, D., Möller, U. (2004). Quantitative Evaluation of Established Clustering Methods for Gene Expression Data. In: Barreiro, J.M., Martín-Sánchez, F., Maojo, V., Sanz, F. (eds) Biological and Medical Data Analysis. ISBMDA 2004. Lecture Notes in Computer Science, vol 3337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30547-7_40

  • DOI: https://doi.org/10.1007/978-3-540-30547-7_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23964-2

  • Online ISBN: 978-3-540-30547-7

  • eBook Packages: Springer Book Archive
