A Comparison of Algorithms to Find Differentially Expressed Genes in Microarray Data

  • Conference paper

Abstract

Several algorithms have been published for identifying differentially expressed genes in DNA microarray experiments. Such algorithms produce ordered lists of genes. To compare their performance, established measures from Information Retrieval are proposed. A benchmark data set with known properties is generated and published. This benchmark data is used to compare the performance of different algorithms with that of a new algorithm, called PUL. Surprisingly, a clear ordering in the performance of the algorithms was observed. PUL outperformed the other algorithms by a factor of two. PUL was also applied successfully in different practical applications. For these experiments, the importance of the genes identified by PUL was independently verified.
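The abstract does not say which Information Retrieval measures are used, so the following is only a minimal sketch, assuming precision and recall at a cutoff k as the measures. It shows how an ordered gene list can be scored against a benchmark whose truly differentially expressed genes are known. The function name `precision_recall_at_k`, the gene identifiers, and both rankings are hypothetical illustrations. They are not taken from the paper and do not represent PUL or any of the compared algorithms.

```python
from typing import Sequence, Set, Tuple


def precision_recall_at_k(ranked_genes: Sequence[str],
                          true_genes: Set[str],
                          k: int) -> Tuple[float, float]:
    """Precision and recall of the top-k entries of a ranked gene list.

    ranked_genes: genes ordered from most to least likely differentially
                  expressed, as produced by some ranking algorithm.
    true_genes:   genes known to be differentially expressed in the benchmark.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_genes[:k]
    # Count how many of the top-k genes are truly differentially expressed.
    hits = sum(1 for gene in top_k if gene in true_genes)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(true_genes) if true_genes else 0.0
    return precision, recall


# Hypothetical toy benchmark: four genes are known to be differentially expressed.
truth = {"g1", "g2", "g3", "g4"}
# Hypothetical ordered lists from two unnamed ranking algorithms.
ranking_a = ["g1", "g2", "g7", "g3", "g9", "g4"]
ranking_b = ["g7", "g9", "g1", "g8", "g2", "g3"]

for name, ranking in (("A", ranking_a), ("B", ranking_b)):
    p, r = precision_recall_at_k(ranking, truth, k=4)
    print(f"algorithm {name}: precision@4={p:.2f}, recall@4={r:.2f}")
```

Because the benchmark has known properties, every ranking can be scored against the same ground truth, which is what makes an ordering of the algorithms by performance possible.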

Author information

Corresponding author

Correspondence to Alfred Ultsch.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ultsch, A., Pallasch, C., Bergmann, E., Christiansen, H. (2009). A Comparison of Algorithms to Find Differentially Expressed Genes in Microarray Data. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_63
