Reproducible Statistical Analysis in Microarray Profiling Studies

  • Conference paper
Applied Parallel Computing. State of the Art in Scientific Computing (PARA 2004)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3732)

Abstract

Reproducibility of calculations is a longstanding issue within the statistical community. Because of the complexity of the algorithms, the size of the data sets, and the limitations of the printed page, it is usually not possible to report all the minutiae of the data processing and statistical computations. Just as a mathematical proof can be critically assessed, it should be possible to check the software behind a complex data analysis. To achieve reproducible calculations and to offer an extensible computational framework, the tool of a compendium is discussed.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mansmann, U., Ruschhaupt, M., Huber, W. (2006). Reproducible Statistical Analysis in Microarray Profiling Studies. In: Dongarra, J., Madsen, K., Waśniewski, J. (eds) Applied Parallel Computing. State of the Art in Scientific Computing. PARA 2004. Lecture Notes in Computer Science, vol 3732. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11558958_114

  • DOI: https://doi.org/10.1007/11558958_114

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29067-4

  • Online ISBN: 978-3-540-33498-9

  • eBook Packages: Computer Science, Computer Science (R0)
