Part of the book series: Studies in Fuzziness and Soft Computing ((STUDFUZZ,volume 242))

Summary

Microarray technology is used to study gene regulation at the genome and transcriptome levels. In the most common application, the expression levels of thousands of genes are monitored simultaneously, producing a large, high-dimensional dataset. Genes with similar functions or shared regulatory elements are assumed to display a common expression profile across a variety of biological conditions. In some cases, it may be desirable to study many drugs simultaneously under different experimental conditions (e.g. concentration or time point) on biological models, which generates 3-way data. Cluster analysis is used to identify biologically relevant groups of genes, and this chapter uses fuzzy cluster analysis for that purpose. After a brief formulation of the problem, we outline the motivations for our choice of clustering algorithm. The fuzzy clustering algorithms are then presented, and their main tuning parameters are discussed in the context of 2-way and 3-way microarray data. We propose a transformation that increases the contrast in distances between all pairs of samples in a dataset, which raises the likelihood of detecting a group structure, if one exists, in a high-dimensional dataset. The performance of the fuzzy C-Means algorithm is demonstrated on real datasets, and the results are validated through functional enrichment analysis of the genes.
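
For readers unfamiliar with the fuzzy C-Means algorithm named in the summary, the following is a minimal sketch of the textbook formulation (alternating updates of cluster centers and membership degrees with squared Euclidean distance). It is not the chapter's own implementation: the function name `fuzzy_c_means`, its parameters, the default fuzziness `m=2.0`, and the random initialization are assumptions chosen for illustration, and the distance transformation proposed in the chapter is not reproduced here.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Textbook fuzzy C-Means on the rows of X (illustrative sketch).

    X          : array of shape (n_samples, n_features), e.g. genes x conditions
    n_clusters : number of clusters c
    m          : fuzziness exponent, must be > 1 (assumed default 2.0)
    Returns (centers, U), where U[i, k] is the membership of sample i in
    cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(max_iter):
        # Center update: mean of the samples weighted by U**m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Squared Euclidean distance of every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero

        # Membership update: u_ik proportional to d_ik ** (-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        # Stop when the memberships no longer change appreciably.
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U
```

Calling `centers, U = fuzzy_c_means(X, n_clusters=2)` on an expression matrix `X` returns one membership row per gene; a hard assignment, if needed, can be taken as `U.argmax(axis=1)`. In practice the fuzziness exponent `m` and the number of clusters are the parameters that most need tuning for high-dimensional data.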


Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Dembélé, D. (2009). Microarray Data Analysis Using Fuzzy Clustering Algorithms. In: Jin, Y., Wang, L. (eds) Fuzzy Systems in Bioinformatics and Computational Biology. Studies in Fuzziness and Soft Computing, vol 242. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89968-6_5

  • DOI: https://doi.org/10.1007/978-3-540-89968-6_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89967-9

  • Online ISBN: 978-3-540-89968-6

  • eBook Packages: Engineering, Engineering (R0)
