Improved Class Prediction in DNA Microarray Gene Expression Data by Unsupervised Reduction of the Dimensionality followed by Supervised Learning with a Perceptron

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

This manuscript describes a combined approach, unsupervised clustering followed by supervised learning, for classifying conditions in DNA array gene expression experiments (in the cases shown, different cell lines, including some cancer types). First, the dimensionality of the dataset of gene expression profiles is reduced to a number of non-redundant clusters of co-expressed genes using an unsupervised clustering algorithm, the Self Organizing Tree Algorithm (SOTA), a hierarchical version of Self Organizing Maps (SOM). The average values of these clusters are then used to train a perceptron, which classifies the conditions very efficiently. This way of reducing the dimensionality of the dataset appears to perform better than previously proposed alternatives such as principal component analysis (PCA). In addition, the weights that connect the gene clusters to the different experimental conditions can be used to assess the relative importance of the genes in defining these classes. Finally, Gene Ontology (GO) terms are used to infer a possible biological role for these groups of genes and to assess the validity of the classification from a biological point of view.
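The two-stage idea in the abstract can be illustrated with a minimal, hedged sketch. It is not the authors' implementation: SOTA is replaced here by a plain k-means with farthest-point seeding, the perceptron is a single binary unit, and the data are a hypothetical toy matrix of 30 genes by 6 conditions. All function names (`cluster_genes`, `cluster_features`, `train_perceptron`, `toy_profiles`) are invented for illustration.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def cluster_genes(profiles, k, iters=10):
    """Plain k-means over gene profiles with farthest-point seeding (a stand-in, not SOTA)."""
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        far = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(profiles)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in profiles]
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_features(profiles, labels, k):
    """One feature vector per condition: the mean expression of each gene cluster."""
    features = []
    for j in range(len(profiles[0])):
        row = []
        for c in range(k):
            values = [p[j] for p, lab in zip(profiles, labels) if lab == c]
            row.append(sum(values) / len(values) if values else 0.0)
        features.append(row)
    return features

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(X, y, epochs=50, lr=0.1):
    """Classic perceptron update rule for binary class labels in {0, 1}."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = target - predict(weights, bias, x)
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias

def toy_profiles():
    """30 hypothetical genes x 6 conditions (conditions 0-2: class 0, 3-5: class 1)."""
    patterns = [[-1, -1, -1, 1, 1, 1],   # genes up in class 1
                [1, 1, 1, -1, -1, -1],   # genes up in class 0
                [0, 0, 0, 0, 0, 0]]      # uninformative genes
    profiles = []
    for g in range(30):
        base = patterns[g // 10]
        # small deterministic perturbation in [-0.1, 0.1] in place of random noise
        profiles.append([base[j] + 0.05 * ((7 * g + 3 * j) % 5 - 2) for j in range(6)])
    return profiles

profiles = toy_profiles()
classes = [0, 0, 0, 1, 1, 1]
labels = cluster_genes(profiles, k=3)                 # step 1: reduce genes to clusters
features = cluster_features(profiles, labels, k=3)    # step 2: average each cluster
weights, bias = train_perceptron(features, classes)   # step 3: supervised learning
predictions = [predict(weights, bias, x) for x in features]
print(predictions)
print([round(w, 3) for w in weights])
```

In this toy setting, the weight attached to the cluster of uninformative genes stays close to zero while the two informative clusters receive weights of opposite sign. This mirrors, in a very simplified way, the abstract's point that the weights can indicate the relative importance of gene clusters for each class.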

Cite this article

Conde, L., Mateos, Á., Herrero, J. et al. Improved Class Prediction in DNA Microarray Gene Expression Data by Unsupervised Reduction of the Dimensionality followed by Supervised Learning with a Perceptron. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 35, 245–253 (2003). https://doi.org/10.1023/B:VLSI.0000003023.90210.c8

  • DOI: https://doi.org/10.1023/B:VLSI.0000003023.90210.c8
