Supervised Learning from Microarray Data

Hastie, Trevor; Tibshirani, Robert; Narasimhan, Balasubramanian; Chu, Gilbert

doi:10.1007/978-3-642-57489-4_7

Abstract

Gene expression arrays pose challenging problems for most traditional supervised learning techniques. We discuss some of the issues involved. We then propose a simple approach to class prediction for DNA microarrays, based on an enhancement of the nearest centroid classifier. Our technique uses soft-thresholded class centroids as prototypes for each class. The shrinkage significantly improves prediction performance and identifies a subset of the genes most responsible for class separation. The method performs as well as or better than competitors from the literature, and is easy to understand and interpret. We illustrate the technique on data from three studies: small round blue cell tumors, leukemia, and breast cancer.
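
The idea of soft-thresholded class centroids can be made concrete with a small sketch. The code below is an illustration only, not the authors' implementation: the abstract does not give formulas, so the standardization by a pooled within-class standard deviation plus a median offset, the per-class factor `m`, the equal class priors, and all names (`soft_threshold`, `fit_shrunken_centroids`, `predict`, `delta`) are assumptions chosen for this example.

```python
import numpy as np


def soft_threshold(d, delta):
    """Move every entry of d toward zero by delta; entries within delta of zero become 0."""
    return np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)


def fit_shrunken_centroids(X, y, delta):
    """Sketch of a soft-thresholded nearest centroid fit.

    X: (n_samples, n_genes) expression matrix; y: 1-D class labels;
    delta: shrinkage amount (a tuning parameter, hypothetical name).
    """
    y = np.asarray(y)
    classes, idx = np.unique(y, return_inverse=True)
    n = X.shape[0]
    n_k = np.bincount(idx)                      # samples per class
    overall = X.mean(axis=0)                    # overall centroid
    centroids = np.vstack(
        [X[idx == k].mean(axis=0) for k in range(len(classes))]
    )
    # Pooled within-class standard deviation for each gene.
    resid = X - centroids[idx]
    s = np.sqrt((resid ** 2).sum(axis=0) / (n - len(classes)))
    # Assumption: add the median as an offset so low-variance genes
    # do not produce huge standardized differences.
    scale = s + np.median(s)
    m = np.sqrt(1.0 / n_k - 1.0 / n)[:, None]   # per-class scaling factor
    d = (centroids - overall) / (m * scale)     # standardized class differences
    d_shrunk = soft_threshold(d, delta)
    shrunken = overall + m * scale * d_shrunk   # shrunken class prototypes
    # A gene stays "active" if at least one class keeps a nonzero difference.
    active = np.any(d_shrunk != 0, axis=0)
    return classes, shrunken, scale, active


def predict(X_new, classes, shrunken, scale):
    """Assign each row to the nearest shrunken centroid (equal priors assumed)."""
    diff = (X_new[:, None, :] - shrunken[None, :, :]) / scale
    return classes[(diff ** 2).sum(axis=2).argmin(axis=1)]
```

In this sketch, `delta = 0` reproduces the ordinary class centroids, while a very large `delta` collapses every class onto the overall centroid and leaves no active genes. Intermediate values keep only the genes whose class means differ most from the overall mean, which is how the shrinkage doubles as gene selection; in practice `delta` would typically be chosen by cross-validation.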

Author information

Authors and Affiliations

Statistics Department and Department of Health, Research and Policy, Stanford University, CA, 94305, USA
Trevor Hastie
Department of Health, Research and Policy, Stanford University, CA, 94305, USA
Robert Tibshirani
Statistics Department, Stanford University, CA, 94305, USA
Balasubramanian Narasimhan
Departments of Biochemistry and Medical Oncology, Stanford University, CA, 94305, USA
Gilbert Chu

Editor information

Editors and Affiliations

CASE — Centre of Applied Statistics and Economics, Humboldt-Universität zu Berlin, Spandauer Straße 1, 10178, Berlin, Germany
Wolfgang Härdle
Wirtschaftswissenschaftliche Fakultät, Institut für Statistik und Ökonometrie, Humboldt-Universität zu Berlin, Spandauer Straße 1, 10178, Berlin, Germany
Bernd Rönz

About this paper

Cite this paper

Hastie, T., Tibshirani, R., Narasimhan, B., Chu, G. (2002). Supervised Learning from Microarray Data. In: Härdle, W., Rönz, B. (eds) Compstat. Physica, Heidelberg. https://doi.org/10.1007/978-3-642-57489-4_7

DOI: https://doi.org/10.1007/978-3-642-57489-4_7
Publisher Name: Physica, Heidelberg
Print ISBN: 978-3-7908-1517-7
Online ISBN: 978-3-642-57489-4
eBook Packages: Springer Book Archive
