Multi-objective optimization for clustering 3-way gene expression data

  • Regular Article
Advances in Data Analysis and Classification

Abstract

In this paper, we use the Fuzzy C-means (FCM) method to cluster 3-way gene expression data by optimizing multiple objectives. We reformulate the total clustering criterion to obtain an expression with fewer variables than the classical FCM criterion. This transformation allows the use of a direct global optimizer, in contrast to the alternating search commonly used. Gene expression data from microarray technology are generally high-dimensional, and such data are known to suffer from the empty space problem. We propose a transformation that increases the contrast in distances between all pairs of data samples, and hence the likelihood of detecting group structure, if any, in high-dimensional datasets.
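The reduction in variables can be illustrated with the standard membership-free form of the FCM criterion, obtained by substituting the optimal memberships back into the objective. The sketch below assumes this is the kind of reformulation meant; it is not the paper's own code. The function names, the fuzziness exponent m = 2, and the toy data are illustrative assumptions. With n samples, c clusters, and p dimensions, the classical criterion depends on c·n memberships plus c·p center coordinates, while the reformulated one depends only on the c·p center coordinates.

```python
# Minimal sketch (assumed names and toy data, not the paper's implementation).
# Assumes no data point coincides exactly with a center, so all distances are > 0.

def squared_distance(x, v):
    """Squared Euclidean distance between a sample x and a center v."""
    return sum((a - b) ** 2 for a, b in zip(x, v))

def classical_criterion(data, centers, m=2.0):
    """Classical FCM objective, with memberships set to their optimal
    values for the given centers: u_ik proportional to d_ik^(-1/(m-1))."""
    total = 0.0
    for x in data:
        d = [squared_distance(x, v) for v in centers]
        weights = [dk ** (-1.0 / (m - 1.0)) for dk in d]
        s = sum(weights)
        total += sum((w / s) ** m * dk for w, dk in zip(weights, d))
    return total

def reformulated_criterion(data, centers, m=2.0):
    """Membership-free form: sum_k (sum_i d_ik^(1/(1-m)))^(1-m).
    Only the centers appear as variables."""
    exponent = 1.0 / (1.0 - m)
    total = 0.0
    for x in data:
        inner = sum(squared_distance(x, v) ** exponent for v in centers)
        total += inner ** (1.0 - m)
    return total

# Toy data: two well-separated groups in two dimensions (illustrative only).
data = [(0.0, 0.0), (0.0, 1.0), (4.0, 4.0), (5.0, 4.0)]
centers = [(0.0, 0.5), (4.5, 4.0)]

j_classical = classical_criterion(data, centers)
j_reformulated = reformulated_criterion(data, centers)
```

Because the reformulated function takes only the centers as input, it can be passed to a general-purpose global optimizer instead of alternating between membership and center updates.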

Author information

Corresponding author

Correspondence to Doulaye Dembélé.

About this article

Cite this article

Dembélé, D. Multi-objective optimization for clustering 3-way gene expression data. Adv Data Anal Classif 2, 211–225 (2008). https://doi.org/10.1007/s11634-008-0032-5

  • DOI: https://doi.org/10.1007/s11634-008-0032-5
