Adaptive Selection of Feature Set Dimensionality for Classification of DNA Microarray Samples

  • Conference paper
Computer Recognition Systems 2

Part of the book series: Advances in Soft Computing (AINSC, volume 45)

Abstract

This work addresses the problem of building predictive models from the results of DNA microarray experiments. It discusses the data analysis challenges that arise from the high dimensionality of the data and the small number of samples usually available from such experiments. A method is proposed that adaptively selects the number of genes to use as features in a predictive model, in order to avoid overfitting, which appears to be the major risk in microarray studies. The proposed approach is illustrated with a numerical example based on gene expression profiles from two types of acute leukemia (data originally published by Golub et al.).
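The abstract does not say how the number of genes is chosen, so the following is only a generic sketch of the idea, not the author's method. It assumes a signal-to-noise gene score in the style of Golub et al., a nearest-centroid classifier, and leave-one-out cross-validation over a list of candidate feature counts. All function names, the synthetic data, and the candidate list are hypothetical. Gene ranking is repeated inside every fold so that the held-out sample never influences feature selection, which is one common way to keep the error estimate from being optimistically biased.

```python
import random
from statistics import mean, stdev


def signal_to_noise(values, labels):
    """Score one gene as |mean0 - mean1| / (sd0 + sd1) over two classes."""
    a = [v for v, t in zip(values, labels) if t == 0]
    b = [v for v, t in zip(values, labels) if t == 1]
    return abs(mean(a) - mean(b)) / (stdev(a) + stdev(b) + 1e-12)


def rank_genes(X, y):
    """Return gene indices sorted from most to least discriminative."""
    n_genes = len(X[0])
    scores = [signal_to_noise([row[g] for row in X], y) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: -scores[g])


def nearest_centroid_predict(X_train, y_train, x, genes):
    """Assign x to the class whose centroid on the chosen genes is closest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, t in zip(X_train, y_train) if t == label]
        dist = sum((x[g] - mean(r[g] for r in rows)) ** 2 for g in genes)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def select_dimensionality(X, y, candidates):
    """Pick the candidate feature count with the lowest leave-one-out error.

    Ties are broken in favour of the smaller feature set.
    """
    errors = {k: 0 for k in candidates}
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        ranking = rank_genes(X_tr, y_tr)  # selection uses training rows only
        for k in candidates:
            pred = nearest_centroid_predict(X_tr, y_tr, X[i], ranking[:k])
            errors[k] += pred != y[i]
    rates = {k: e / len(X) for k, e in errors.items()}
    best_k = min(candidates, key=lambda k: (rates[k], k))
    return best_k, rates


def make_toy_data(n_per_class=10, n_genes=100, n_informative=5, seed=0):
    """Synthetic stand-in for expression data: few samples, many genes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for g in range(n_informative):
                row[g] += 2.0 * label  # only the first genes carry signal
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best_k, rates = select_dimensionality(X, y, [1, 5, 20, 100])
    print("estimated error per feature count:", rates)
    print("selected number of genes:", best_k)
```

With only a handful of samples the leave-one-out estimate is itself noisy, so in practice the selected count is better treated as a rough guide than as a precise optimum.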

References

  1. Bittner M, Meltzer P, Chen Y (2000) Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature 406:536–540

  2. Dudoit S, Shaffer J, Boldrick J (2002) Multiple Hypothesis Testing in Microarray Experiments. UC Berkeley Division of Biostatistics Working Paper Series, Paper 110

  3. Eisen M, et al. (1998) Proc. Natl. Acad. Sci. USA 95:14863–14868

  4. Everitt B (1980) Cluster Analysis, Second Edition. Heinemann Educational Books Ltd., London

  5. Ewens W, Grant G (2001) Statistical Methods in Bioinformatics. Springer, Berlin Heidelberg New York

  6. Faller D, et al. (2003) Journal of Computational Biology 10:751–762

  7. Golub T, et al. (1999) Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science 286:531–537

  8. Hastie T, Tibshirani R, Friedman J (2002) The Elements of Statistical Learning. Data Mining, Inference and Prediction. Springer, Berlin Heidelberg New York

  9. Hoffmann R, Seidl T, Dugas M (2002) Profound effect of normalization on detection of differentially expressed genes in oligonucleotide microarray data analysis. Genome Biology

  10. Maciejewski H, Jasinska A (2005) Clustering DNA microarray data. Computer recognition systems CORES 05, Springer Advances in Soft Computing

  11. Maciejewski H, Konarski L (2007) Building a predictive model from data in high dimensions with application to analysis of microarray experiments. DepCoS — RELCOMEX, IEEE Computer Society Press

  12. MAQC Consortium [Shi L. et al.] (2006) The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nature Biotechnology 24

  13. Markowetz F, Spang R (2005) Molecular diagnosis: Classification, model selection and performance evaluation. Methods Inf. Med. 44:438–443

  14. Quackenbush J (2001) Nature Reviews Genetics 2:418–427

  15. Shannon W, Culverhouse R, Duncan J (2003) Pharmacogenomics 4:41–51

  16. Tamayo P, et al. (1999) Proc. Natl. Acad. Sci. USA 96:2907–2912

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maciejewski, H. (2007). Adaptive Selection of Feature Set Dimensionality for Classification of DNA Microarray Samples. In: Kurzynski, M., Puchala, E., Wozniak, M., Zolnierek, A. (eds) Computer Recognition Systems 2. Advances in Soft Computing, vol 45. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75175-5_103

  • DOI: https://doi.org/10.1007/978-3-540-75175-5_103

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75174-8

  • Online ISBN: 978-3-540-75175-5

  • eBook Packages: Engineering, Engineering (R0)
