Elsevier

NeuroImage

Volume 40, Issue 2, 1 April 2008, Pages 655-661

False positives in imaging genetics

https://doi.org/10.1016/j.neuroimage.2007.11.058

Abstract

Imaging genetics provides an enormous amount of functional–structural data on gene effects in living brain, but the sheer quantity of potential phenotypes raises concerns about false discovery. Here, we provide the first empirical results on false positive rates in imaging genetics.

We analyzed 720 frequent coding SNPs without significant association with schizophrenia and a subset of 492 of these without association with cognitive function. Effects on brain structure (using voxel-based morphometry, VBM) and brain function, using two archival imaging tasks, the n-back working memory task and an emotional face matching task, were studied in whole brain and regions of interest and corrected for multiple comparisons using standard neuroimaging procedures. Since these variants are unlikely to impact relevant brain function, positives obtained provide an upper empirical estimate of the false positive association rate. In a separate analysis, we randomly permuted genotype labels across subjects, removing any true genotype–phenotype association in the data, to derive a lower empirical estimate.

At a set correction level of 0.05, in each region of interest and data set used, the rate of positive findings was well below 5% (0.2–4.1%). There was no relationship between the region of interest and the false positive rate. Permutation results were in the same range as empirically derived rates.

The observed low rates of positives provide empirical evidence that the type I error rate is well controlled by current commonly used correction procedures in imaging genetics, at least in the context of the imaging paradigms we have used. In fact, our observations indicate that these statistical thresholds are conservative.

Introduction

Imaging genetics, the association of genetic variation with data derived from structural and functional neuroimaging, has emerged as a fruitful approach to discover neural mechanisms associated with genetic risk for psychiatric disorders (Hariri and Weinberger, 2003, Meyer-Lindenberg and Weinberger, 2006). Imaging provides, for each subject, an enormous amount of functional–structural data that can potentially characterize gene effects in living brain, which have proven surprisingly penetrant at this level (Canli et al., 2005, Heinz et al., 2005, Meyer-Lindenberg et al., 2006a, Meyer-Lindenberg et al., 2006b, Pezawas et al., 2005). However, the very richness of this information also raises new questions, since genetics is usually concerned with a much more limited set of dependent measures, e.g., simple categorical (e.g., disease status) or quantitative (e.g., IQ, personality scores) phenotypes. Specifically, since the functional relevance in increasing risk for psychiatric disorders has been established unambiguously only for very few loci (Straub and Weinberger, 2006), concern has been voiced over the possibility of spurious positive findings when genetic variants of uncertain functional relevance are related to high-dimensional imaging data (Meyer-Lindenberg and Weinberger, 2006). This problem becomes even more pressing as large numbers of genotypes are becoming available for imaging genetics studies in the context of the current wave of genome-wide association studies in psychiatry.

To study this question empirically, we identified a panel of genetic variants (SNPs) with low prior probability of being associated with brain structure or function relevant for psychiatric disorders. These were then tested in several large neuroimaging cohorts. As these SNPs are not expected to show an effect, significant findings obtained for them should provide an upper empirical estimate for the expected rate of false positives in imaging genetics (“upper estimate” because the actual false positive rate could be lower if some of the selected variants affect the brain after all, meaning that some observed imaging effects could reflect true neurobiology). While it would be comparatively straightforward to simulate single genotype labels for a given set of imaging data, we expected this empirical approach to be considerably more valid. The effects and distribution in human brain of the great majority of genetic variants in the human genome are entirely unknown and, given ancestral stratification and linkage disequilibrium, a large number of these variants of unknown effect will show, to varying degrees, association with any SNP under study. In other words, while a given target label can be simulated, approximating the level of genetic and neural complexity associated with the genetic background by simulated data is not presently feasible, making the use of real data sets and genotypes important. However, to also establish a lower empirical estimate for false positives, the target genotypes were then randomly permuted, removing the genotype–phenotype relationship for those variants, and the analyses were repeated.
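
The permutation step described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the toy phenotype is a single scalar per subject rather than an image, the significance rule is an uncorrected Welch t threshold, and all names (`welch_t`, `permuted_positive_rate`, `critical_t`, the toy data) are hypothetical. It only shows how shuffling genotype labels across subjects removes any true association, so that the remaining positives estimate the null rate.

```python
import math
import random
import statistics


def welch_t(labels, values):
    """Welch t statistic comparing non-carriers (label 0) with carriers (label 1)."""
    a = [v for g, v in zip(labels, values) if g == 0]
    b = [v for g, v in zip(labels, values) if g == 1]
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def permuted_positive_rate(labels, values, critical_t=1.96, n_perm=1000, seed=0):
    """Share of random label permutations whose |t| exceeds critical_t."""
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any true genotype-phenotype pairing
        if abs(welch_t(shuffled, values)) > critical_t:
            hits += 1
    return hits / n_perm


# Toy data (hypothetical): 100 subjects, 50 carriers, one scalar phenotype each.
data_rng = random.Random(1)
toy_labels = [0] * 50 + [1] * 50
toy_values = [data_rng.gauss(0.0, 1.0) for _ in toy_labels]
rate = permuted_positive_rate(toy_labels, toy_values)
print(f"positive rate under permutation: {rate:.3f}")
```

In a real imaging analysis the call to `welch_t` would be replaced by the full voxel-wise model and corrected threshold; the shuffling logic is the only part meant to carry over.
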

From a panel of 907 common nonconservative coding SNPs, we selected 720 SNPs that had a minor allele frequency > 10% in our data and did not show significant (at the level of p = 0.05, Bonferroni-corrected for the number of tests) association in either a family-based or case–control analysis with the broad diagnosis of schizophrenia (including schizoaffective disorder depressed and various axis II spectrum diagnoses) from our ongoing genetic association study of schizophrenia. However, Bonferroni correction is more accurate when tests are independent and less accurate (more conservative) when there is linkage disequilibrium, making it possible that some markers had true association with the disease but did not meet this conservative threshold. Therefore, of these 720 variants, 492 SNPs were selected that also did not show any association with disease or with a panel of cognitive measures even at the lenient p = 0.05, uncorrected level, and were analyzed separately as a supplementary analysis.

For each selected SNP, we then analyzed effects on brain structure (using voxel-based morphometry, VBM) (Pezawas et al., 2005) and brain function, using two archival imaging tasks, the n-back working memory task (Callicott et al., 2005) and an emotional face matching task (Hariri et al., 2002a). These imaging paradigms have previously shown strong association to genes related to schizophrenia and to cognition and emotion (Hariri and Weinberger, 2003, Meyer-Lindenberg and Weinberger, 2006). Genotyped healthy subjects, overlapping with previously published cohorts, were used. Following standard practice in the field, we analyzed genetic association throughout the whole brain as well as in a priori hypothesized regions of interest (see Fig. 1). For the latter, we chose hippocampus and prefrontal cortex, typically used with memory tasks (Callicott et al., 2005), and amygdala, used for the face matching task (Meyer-Lindenberg et al., 2006a), as described previously.
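
The SNP selection rule stated above (minor allele frequency above 10%, and no Bonferroni-significant association in either the family-based or the case–control analysis) can be written as a simple filter. This is a sketch under assumptions: the record layout, the field names, the toy values, and the choice of `n_tests` are all hypothetical, since the text does not state the exact number of tests used for the Bonferroni correction.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    name: str              # hypothetical identifier
    maf: float             # minor allele frequency in the sample
    p_family: float        # p-value, family-based analysis
    p_case_control: float  # p-value, case-control analysis


def select_null_snps(snps, n_tests, alpha=0.05, min_maf=0.10):
    """Keep SNPs with MAF > min_maf and no Bonferroni-significant association."""
    threshold = alpha / n_tests  # Bonferroni-corrected per-test threshold
    return [
        s for s in snps
        if s.maf > min_maf
        and s.p_family >= threshold
        and s.p_case_control >= threshold
    ]


# Toy panel (hypothetical values, not data from the study).
panel = [
    Snp("snp_a", 0.25, 0.40, 0.60),   # kept
    Snp("snp_b", 0.05, 0.70, 0.80),   # dropped: MAF too low
    Snp("snp_c", 0.30, 0.001, 0.50),  # dropped: significant in family-based test
    Snp("snp_d", 0.20, 0.03, 0.50),   # kept: 0.03 is above 0.05 / 4
]
kept = [s.name for s in select_null_snps(panel, n_tests=len(panel))]
print(kept)
```

The stricter supplementary subset described in the text would correspond to replacing the Bonferroni threshold with an uncorrected 0.05 and adding the cognitive measures as further p-value fields.
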

Two methods were used to determine the corrected thresholds: family-wise error (FWE) correction based on Gaussian random field theory (Worsley et al., 1996), which controls the probability of obtaining any false positive across the brain, and false discovery rate (FDR), a frequentist method which controls the expected proportion of signal (rejected null hypotheses) that represents false positives (Genovese et al., 2002).
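
To make the difference between the two kinds of control concrete, the sketch below contrasts a Bonferroni threshold with the Benjamini-Hochberg step-up procedure on a short list of p-values. Both are stand-ins chosen for brevity: the study used Gaussian random field theory for FWE, not Bonferroni, and the text names only "FDR" without spelling out the procedure. The function names and the example p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject where p <= alpha / m; controls the family-wise error rate."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure; controls FDR at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Illustrative p-values (hypothetical, not from the study).
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
fwe_hits = sum(bonferroni(p_vals))
fdr_hits = sum(benjamini_hochberg(p_vals))
print(fwe_hits, fdr_hits)
```

On this toy input the FDR procedure rejects two hypotheses and Bonferroni one, which reflects the general pattern that FDR control is the less conservative of the two.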

Subjects

Subjects with schizophrenia spectrum disorders, their unaffected siblings, and controls came from the Clinical Brain Disorders Branch “Sibling Study,” a study of neurobiological abnormalities related to genetic risk for schizophrenia (Egan et al., 2001). Only Caucasians of European ancestry were studied to minimize heterogeneity and potential stratification artifacts. DNA was available for 296 European American patients, 306 of their siblings, 439 parents of probands, and 350 controls. All

Results

Our results are shown in Table 3. The main finding of the present study was that, for each region of interest and data set used, the rate of positive findings was well below 5% (range 0.2–4.1%). For all combinations of imaging modalities, regions of interest, and correction methods, the false positive rate was significantly (p < 0.05, one-sample t-test) less than 5%, with one exception: the amygdala ROI using FWE correction during the faces task.
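
One way to check whether an observed positive rate is credibly below the nominal 5% is an exact binomial lower-tail probability. This is a swapped-in technique, not the one-sample t-test reported above, and it assumes independent analyses, which holds only approximately for SNPs in linkage disequilibrium. The counts are hypothetical and are not taken from Table 3.

```python
from math import comb


def binom_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical example: 10 positive SNPs out of 720 tested at a nominal 5% level.
n_snps, n_positive, nominal = 720, 10, 0.05
observed_rate = n_positive / n_snps
p_lower = binom_lower_tail(n_positive, n_snps, nominal)
print(f"observed rate {observed_rate:.3%}, P(X <= {n_positive}) = {p_lower:.2e}")
```

A small lower-tail probability indicates that so few positives would be unlikely if the true per-analysis rate were 5%, which is the sense in which a threshold can be called conservative.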

We further investigated the question whether the type

Discussion

We provide the first empirical data on false positive rates in imaging genetics. Our analysis shows that regardless of the correction method employed, positive rates in an imaging genetics data set selected for low likelihood of association with neural intermediate phenotypes were well below 5%. Since under the null hypothesis of no association of imaging phenotypes with the evaluated SNPs, at a level of p = 0.05, corrected, false positive results would have been expected in 5% of analyses, the

Acknowledgment

This work was funded in part by the NIMH/IRP.

References (31)

  • V.L. Morgan et al.

    Comparison of fMRI statistical software packages and strategies for analysis of images containing random and stimulus-correlated motion

    Comput. Med. Imaging Graph.

    (2007)
  • R.E. Straub et al.

    Schizophrenia genes—famine to feast

    Biol. Psychiatry

    (2006)
  • G.R. Abecasis et al.

    Merlin-rapid analysis of dense genetic maps using sparse gene flow trees

    Nat. Genet.

    (2002)
  • J.H. Callicott et al.

    Physiological dysfunction of the dorsolateral prefrontal cortex in schizophrenia revisited

    Cereb. Cortex

    (2000)
  • J.H. Callicott et al.

    Variation in DISC1 affects hippocampal structure and function and increases risk for schizophrenia

    Proc. Natl. Acad. Sci. U. S. A.

    (2005)
    1

    Present address: Merck & Co, Inc., BL2-6, PO Box 4, West Point, PA 19486, USA.

    2

    Fax: +1 301 480 7795.
