minPtest: a resampling based gene region-level testing procedure for genetic case-control studies

  • Original Paper
  • Computational Statistics

Abstract

Current genotyping technologies generate a huge number of single nucleotide polymorphism (SNP) measurements in case-control studies. The resulting multiple testing problem can be ameliorated by considering candidate gene regions. The minPtest R package provides the first widely accessible implementation of a gene region-level summary for each candidate gene using the min \(P\) test, a permutation-based method that can be built on different univariate tests per SNP. The package brings together three kinds of tests that were previously scattered over several R packages, and automatically selects the most appropriate one for the study design at hand. The implementation integrates two different parallel computing packages, so that the computation can make use of the available computing resources.
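The idea behind a permutation-based min \(P\) test can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the minPtest package's R implementation. Everything here is an assumption made for illustration: the function names (`trend_p_value`, `min_p`, `min_p_test`), the choice of a trend test as the per-SNP test (using the identity that the trend statistic equals \(N r^2\), with \(r\) the correlation between genotype score and case status), and the `(hits + 1) / (n_perm + 1)` form of the permutation p-value.

```python
import math
import random


def trend_p_value(genotypes, labels):
    """Per-SNP trend test p-value (illustrative choice of univariate test).

    genotypes: minor-allele counts (0, 1, 2), one per subject.
    labels:    case (1) / control (0) status, one per subject.
    """
    n = len(labels)
    mean_g = sum(genotypes) / n
    mean_y = sum(labels) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, labels))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in labels)
    if sxx == 0 or syy == 0:
        return 1.0  # monomorphic SNP or a single group: no evidence
    chi2 = n * sxy * sxy / (sxx * syy)  # N * r^2, 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))  # chi-square(1) upper tail


def min_p(snps, labels):
    """Smallest per-SNP p-value within one gene region."""
    return min(trend_p_value(snp, labels) for snp in snps)


def min_p_test(snps, labels, n_perm=1000, seed=0):
    """Region-level p-value: permute case-control labels, recompute min P."""
    rng = random.Random(seed)
    observed = min_p(snps, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        if min_p(snps, shuffled) <= observed:
            hits += 1
    # Add-one form keeps the estimate strictly positive (an assumption here).
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    labels = [0] * 20 + [1] * 20
    region = [
        [0] * 20 + [2] * 20,   # SNP strongly associated with case status
        [0, 1, 2, 1] * 10,     # SNP unrelated to case status
    ]
    observed, region_p = min_p_test(region, labels, n_perm=200, seed=1)
    print(observed, region_p)
```

Because the labels are permuted jointly for all SNPs of a region, the correlation between neighbouring SNPs is preserved in every permutation, which is what lets a single region-level p-value account for the number of SNPs tested.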

Corresponding author

Correspondence to Stefanie Hieke.

Cite this article

Hieke, S., Binder, H., Nieters, A. et al. minPtest: a resampling based gene region-level testing procedure for genetic case-control studies. Comput Stat 29, 51–63 (2014). https://doi.org/10.1007/s00180-012-0391-4

  • DOI: https://doi.org/10.1007/s00180-012-0391-4
