False positives in imaging genetics
Introduction
Imaging genetics, the association of genetic variation with data derived from structural and functional neuroimaging, has emerged as a fruitful approach for discovering neural mechanisms associated with genetic risk for psychiatric disorders (Hariri and Weinberger, 2003, Meyer-Lindenberg and Weinberger, 2006). Imaging provides, for each subject, an enormous amount of functional–structural data that can potentially characterize gene effects in the living brain, which have proven surprisingly penetrant at this level (Canli et al., 2005, Heinz et al., 2005, Meyer-Lindenberg et al., 2006a, Meyer-Lindenberg et al., 2006b, Pezawas et al., 2005). However, the very richness of this information also raises new questions, since genetics is usually concerned with a much more limited set of dependent measures, e.g., simple categorical (e.g., disease status) or quantitative (e.g., IQ, personality scores) phenotypes. Specifically, since the functional relevance in increasing risk for psychiatric disorders has been established unambiguously for only very few loci (Straub and Weinberger, 2006), concern has been voiced over the possibility of spurious positive findings when genetic variants of uncertain functional relevance are related to high-dimensional imaging data (Meyer-Lindenberg and Weinberger, 2006). This problem becomes even more pressing as large numbers of genotypes are becoming available for imaging genetics studies in the context of the current wave of genome-wide association studies in psychiatry.
To study this question empirically, we identified a panel of genetic variants (SNPs) with low prior probability of being associated with brain structure or function relevant for psychiatric disorders. These were then tested in several large neuroimaging cohorts. As these SNPs are not expected to show an effect, significant findings obtained for them should provide an upper empirical estimate for the expected rate of false positives in imaging genetics (“upper estimate” because the actual false positive rate could be lower if some of the selected variants affect the brain after all, meaning that some observed imaging effects could reflect true neurobiology). While it would be comparatively straightforward to simulate single genotype labels for a given set of imaging data, we expected this empirical approach to be considerably more valid. The effects and distribution in the human brain of the great majority of genetic variants in the human genome are entirely unknown and, given ancestral stratification and linkage disequilibrium, a large number of these variants of unknown effect will show, to varying degrees, association with any SNP under study. In other words, while a given target label can be simulated, approximating the level of genetic and neural complexity associated with the genetic background by simulated data is not presently feasible, which makes the use of real data sets and genotypes important. However, to also establish a lower empirical estimate for false positives, the target genotypes were then randomly permuted, removing the genotype–phenotype relationship for those variants, and the analyses were repeated.
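The permutation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `permute_genotypes`, the 0/1/2 minor-allele coding, and the toy genotype vector are all hypothetical. Shuffling the labels across subjects leaves the genotype counts unchanged while breaking any link between a subject's genotype and that subject's imaging data.

```python
import random

def permute_genotypes(genotypes, seed=None):
    """Return a shuffled copy of the genotype labels; the input list is not modified."""
    rng = random.Random(seed)   # local generator, so the global random state is untouched
    shuffled = list(genotypes)
    rng.shuffle(shuffled)
    return shuffled

# Hypothetical data: minor-allele counts (0, 1, 2) for ten subjects.
genotypes = [0, 1, 2, 0, 1, 1, 0, 2, 0, 1]
permuted = permute_genotypes(genotypes, seed=1)
```

Because only the assignment of labels to subjects changes, the imaging data and the allele frequencies stay exactly as observed, which is what lets the permuted analysis serve as a reference for the unpermuted one.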
From a panel of 907 common nonconservative coding SNPs, we selected 720 SNPs that had a minor allele frequency > 10% in our data and did not show significant (at the level of p = 0.05, Bonferroni-corrected for the number of tests) association in either a family-based or case–control analysis with the broad diagnosis of schizophrenia (including schizoaffective disorder depressed and various axis II spectrum diagnoses) from our ongoing genetic association study of schizophrenia. However, Bonferroni correction is more accurate when tests are independent and less accurate (more conservative) when there is linkage disequilibrium, making it possible that some markers had true association with the disease but did not meet this conservative threshold. Therefore, of these 720 variants, 492 SNPs were selected that also did not show any association with disease and a panel of cognitive measures even at the lenient p = 0.05, uncorrected, level, and were analyzed separately as a supplementary analysis. For each selected SNP, we then analyzed effects on brain structure (using voxel-based morphometry, VBM) (Pezawas et al., 2005) and brain function, using two archival imaging tasks, the n-back working memory task (Callicott et al., 2005) and an emotional face matching task (Hariri et al., 2002a). These imaging paradigms have previously shown strong association to genes related to schizophrenia and to cognition and emotion (Hariri and Weinberger, 2003, Meyer-Lindenberg and Weinberger, 2006). The genotyped healthy subjects studied overlapped with previously published cohorts. Following standard practice in the field, we analyzed genetic association throughout the whole brain as well as in a priori hypothesized regions of interest (see Fig. 1). For the latter, we chose hippocampus and prefrontal cortex, typically used with memory tasks (Callicott et al., 2005), and amygdala, used for the face matching task (Meyer-Lindenberg et al., 2006a), as described previously.
Two methods were used to determine the corrected thresholds: family-wise error (FWE) based on Gaussian Random Fields theory (Worsley et al., 1996), which controls the probability of any false positive across the brain, and false discovery rate (FDR), a frequentist method which controls the error rate, i.e., the expected proportion of signal (rejected null hypotheses) that represents false positives (Genovese et al., 2002).
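The difference between the two error criteria can be illustrated on a flat list of p-values. Two assumptions apply to this sketch. First, Bonferroni correction stands in for family-wise error control; it is not the Gaussian Random Fields procedure cited above, which also accounts for spatial smoothness. Second, the FDR part uses the Benjamini-Hochberg step-up rule, a standard FDR procedure. The function names and the p-values are hypothetical.

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Family-wise control: reject only p-values at or below alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def benjamini_hochberg_reject(pvals, q=0.05):
    """FDR control: reject the k smallest p-values, k = largest rank with p_(k) <= k*q/m."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])   # indices from smallest p to largest
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject

# Hypothetical p-values, invented for illustration.
pvals = [0.001, 0.008, 0.02, 0.04, 0.30, 0.75]
fwe_like = bonferroni_reject(pvals)
fdr = benjamini_hochberg_reject(pvals)
```

On this toy input the FDR rule rejects one more hypothesis than Bonferroni does, reflecting that it bounds the expected share of false positives among the rejections rather than the chance of making any false positive at all.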
Subjects
Subjects with schizophrenia spectrum disorders, their unaffected siblings, and controls came from the Clinical Brain Disorders Branch “Sibling Study,” a study of neurobiological abnormalities related to genetic risk for schizophrenia (Egan et al., 2001). Only Caucasians of European ancestry were studied to minimize heterogeneity and potential stratification artifacts. DNA was available for 296 European American patients, 306 of their siblings, 439 parents of probands, and 350 controls. All
Results
Our results are shown in Table 3. The main finding of the present study was that for each region of interest and data set used, the rate of positive findings was well below 5% (range 0.2–4.1%). For all combinations of imaging modalities, regions of interest, and correction methods, the false positive rate was significantly (p < 0.05, one-sample t-test) less than 5%, with one exception: the amygdala ROI using FWE correction during the faces task.
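A one-sample t-test of observed positive rates against the nominal 5% level can be sketched as below. The excerpt does not say which observations entered the test, so this sketch simply assumes a list of rates. The values are invented for illustration and are not the study's data. The helper names `t_cdf` and `one_sided_t_test_below` are hypothetical, and the t distribution is integrated numerically so that only the standard library is needed.

```python
import math
import statistics

def t_cdf(t, df, steps=2000):
    """CDF of Student's t, via Simpson's rule on the density (steps must be even)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3             # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

def one_sided_t_test_below(values, mu0):
    """Test H1: mean(values) < mu0. Returns (t statistic, one-sided p-value)."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)   # standard error of the mean
    t = (statistics.mean(values) - mu0) / se
    return t, t_cdf(t, n - 1)

# Hypothetical positive rates (as proportions), invented for illustration.
rates = [0.012, 0.020, 0.008, 0.031, 0.015, 0.041, 0.002, 0.018, 0.025, 0.010]
t_stat, p_value = one_sided_t_test_below(rates, 0.05)
```

A negative t statistic with a small one-sided p-value indicates that the mean rate lies below 5%. For large samples a library routine would be preferable, since `math.gamma` overflows at high degrees of freedom.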
We further investigated the question whether the type
Discussion
We provide the first empirical data on false positive rates in imaging genetics. Our analysis shows that regardless of the correction method employed, positive rates in an imaging genetics data set selected for low likelihood of association with neural intermediate phenotypes were well below 5%. Since under the null hypothesis of no association of imaging phenotypes with the evaluated SNPs, at a level of p = 0.05, corrected, false positive results would have been expected in 5% of analyses, the
Acknowledgment
This work was funded in part by the NIMH/IRP.
References
- et al. Voxel-based morphometry—the methods. NeuroImage (2000)
- et al. SmartPhantom—an fMRI simulator. Magn. Reson. Imaging (2006)
- et al. To modulate or not to modulate: differing results in uniquely shaped Williams syndrome brains. NeuroImage (2006)
- et al. Relative risk for cognitive impairments in siblings of patients with schizophrenia. Biol. Psychiatry (2001)
- et al. Detecting activations in PET and fMRI: levels of inference and power. NeuroImage (1996)
- et al. Factor analysis of neurocognitive tests in a large sample of schizophrenic probands, their siblings, and healthy controls. Schizophr. Res. (2007)
- et al. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. NeuroImage (2002)
- et al. The amygdala response to emotional stimuli: a comparison of faces and scenes. NeuroImage (2002)
- et al. Nonstationary cluster-size inference with random field and permutation methods. NeuroImage (2004)
- et al. Empirical comparison of maximal voxel and non-isotropic adjusted cluster extent results in a voxel-based morphometry study of comorbid learning disability with schizophrenia. NeuroImage (2005)
- Comparison of fMRI statistical software packages and strategies for analysis of images containing random and stimulus-correlated motion. Comput. Med. Imaging Graph.
- Schizophrenia genes—famine to feast. Biol. Psychiatry
- Merlin-rapid analysis of dense genetic maps using sparse gene flow trees. Nat. Genet.
- Physiological dysfunction of the dorsolateral prefrontal cortex in schizophrenia revisited. Cereb. Cortex
- Variation in DISC1 affects hippocampal structure and function and increases risk for schizophrenia. Proc. Natl. Acad. Sci. U. S. A.
- 1 Present address: Merck & Co, Inc., BL2-6, PO Box 4, West Point, PA 19486, USA.
- 2 Fax: +1 301 480 7795.