Abstract
Because microarray experiments are costly and biological materials are often limited, investigators frequently have to run their experiments with few replicates or with none at all. However, the heterogeneous error variability observed in microarray experiments makes unreplicated data difficult to analyze, and no current analysis technique is practically applicable to such data. We introduce a statistical method, the unreplicated heterogeneous error model (UHEM), for analyzing microarray data without replication. The method first eliminates, nonparametrically, genes that are differentially expressed between biological conditions, and then estimates the local error variance from the many remaining genes of adjacent intensity. We compared the performance of UHEM with three empirical Bayes prior specification methods: between-condition local pooled error, pseudo standard error, and adaptive standard error-based HEM. We found that UHEM is effective for microarray data analysis when replication of an array experiment is impractical or prohibited.
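The idea of borrowing variance information from adjacent-intensity genes can be illustrated with a minimal sketch. It is not the UHEM procedure itself: the function name `local_error_variance`, the global rank-based trimming rule, and the values of `trim_fraction` and `window` are all assumptions made for illustration, and the empirical Bayes step is omitted entirely.

```python
import bisect
import random
import statistics


def local_error_variance(x, y, trim_fraction=0.1, window=50):
    """Estimate a local variance of M = x - y for every gene.

    x, y: log-intensities of the same genes under two conditions, one array each.
    trim_fraction and window are illustrative tuning values, not taken from the paper.
    """
    a = [(xi + yi) / 2 for xi, yi in zip(x, y)]  # average log-intensity per gene
    m = [xi - yi for xi, yi in zip(x, y)]        # log-ratio between conditions

    # Rank-based trimming (assumed rule): drop the genes with the largest
    # |M - median(M)|, treating them as candidate differentially expressed genes.
    med = statistics.median(m)
    dev = sorted(abs(mi - med) for mi in m)
    cutoff = dev[int((1 - trim_fraction) * (len(dev) - 1))]
    kept = sorted((ai, mi) for ai, mi in zip(a, m) if abs(mi - med) <= cutoff)
    kept_a = [ai for ai, _ in kept]
    kept_m = [mi for _, mi in kept]

    variances = []
    for ai in a:
        # Window of retained genes closest in intensity rank to this gene.
        pos = bisect.bisect_left(kept_a, ai)
        lo = max(0, min(pos - window // 2, len(kept_m) - window))
        hi = min(len(kept_m), lo + window)
        variances.append(statistics.variance(kept_m[lo:hi]))
    return variances


# Synthetic, unreplicated data: noise is larger at low intensity (assumed setup).
random.seed(0)
mu = [random.uniform(4, 14) for _ in range(2000)]
sd = [1.0 if u < 9 else 0.2 for u in mu]
x = [random.gauss(u, s) for u, s in zip(mu, sd)]
y = [random.gauss(u, s) for u, s in zip(mu, sd)]

variances = local_error_variance(x, y)
low = statistics.mean(v for u, v in zip(mu, variances) if u < 9)
high = statistics.mean(v for u, v in zip(mu, variances) if u >= 9)
print(f"mean local variance: low intensity {low:.3f}, high intensity {high:.3f}")
```

Because the trimming here is applied globally rather than within intensity ranges, it removes more genes where the noise is large and so underestimates the variance there. A more careful version would trim locally and correct for the truncation.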
Cite this article
Cho, H., Kang, J. & Lee, J.K. Empirical Bayes analysis of unreplicated microarray data. Comput Stat 24, 393–408 (2009). https://doi.org/10.1007/s00180-008-0133-9