An evaluation of thresholding techniques in fMRI analysis
Introduction
Many fMRI experiments share the objective of identifying active voxels in a neuroimaging dataset. In single-subject experiments, for example, this is done by performing individual voxel-wise tests of the null hypothesis that the observed time course is not related to an assigned reference function (Bandettini et al., 1993; Cox et al., 1995). A voxel activation map is then constructed by applying a thresholding rule to the resulting t statistics.
This paper describes three error rates that may be used to formally set activation thresholds based on individual voxel-wise test statistics, but not on cluster size. We review methods that control each of the error rates at a prespecified level α. These methods include simple procedures that ignore spatial correlation among the test statistics as well as more elaborate ones that incorporate this correlation information. The operating characteristics of the methods are shown through a simulation study, which highlights two results. First, as has been noted previously, the choice of error rate substantially affects the power to detect true activations. Second, complicated procedures that explicitly account for the correlation structure do not improve the power in most practical situations, except when the data are extremely strongly correlated. Therefore, for most single-subject analyses, the simple procedures are recommended in practice. A real bilateral finger-tapping experiment is used to illustrate the methods and conclusions.
Section snippets
Problem and error rates
A common way of determining significance of a statistical hypothesis test is to specify the significance level or type I error rate of the test, usually denoted by α, and use this to determine a threshold. The type I error rate is the probability that, if the voxel were truly inactive, its test statistic would exceed the threshold, leading to the incorrect conclusion that it is active. This significance level determines the threshold, so that, for example, a 5% level voxel z test would have a
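The link between a significance level and a threshold can be illustrated with a minimal Python sketch. It assumes the voxel statistic is standard normal under the null hypothesis; the function name `z_threshold` and the `two_sided` switch are illustrative and are not taken from the paper, which does not state in this snippet whether a one- or two-sided test is intended.

```python
from statistics import NormalDist


def z_threshold(alpha: float, two_sided: bool = False) -> float:
    """Return the z cutoff whose N(0, 1) tail probability equals alpha.

    Assumption: the test statistic is standard normal when the voxel is
    truly inactive. For a two-sided test, alpha is split across both tails.
    """
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)


print(round(z_threshold(0.05), 3))        # one-sided 5% level: 1.645
print(round(z_threshold(0.05, True), 3))  # two-sided 5% level: 1.96
```

A voxel would be declared active under this rule when its z statistic exceeds the returned cutoff.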
Methods for controlling the FWE
The simplest way to control the FWE is through the Bonferroni method. To apply this, simply divide the individual threshold significance level α by the number of voxel hypotheses m to arrive at an adjusted threshold significance level α' = α/m for each voxel test. This guarantees that FWE is no larger than α because
One limitation of the Bonferroni method is that it results in conservative control of the FWE (i.e., fewer voxels declared
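The Bonferroni adjustment described above, α' = α/m, can be sketched in a few lines of Python. The function names are hypothetical, the 64 × 64 image size is borrowed from the simulation design only as an example, and the z cutoff assumes a one-sided test with a standard normal null distribution.

```python
from statistics import NormalDist


def bonferroni_level(alpha: float, m: int) -> float:
    """Per-voxel significance level alpha' = alpha / m."""
    return alpha / m


def bonferroni_active(p_values, alpha=0.05):
    """Flag each voxel whose P value is at or below alpha / m."""
    level = bonferroni_level(alpha, len(p_values))
    return [p <= level for p in p_values]


# Example: one 64 x 64 slice, so m = 4096 voxel hypotheses.
m = 64 * 64
level = bonferroni_level(0.05, m)
# One-sided z cutoff under an assumed N(0, 1) null distribution.
z_cut = NormalDist().inv_cdf(1.0 - level)
print(level, z_cut)
```

Comparing `z_cut` with the unadjusted 5% cutoff shows how quickly the per-voxel threshold grows with m, which is the source of the conservatism noted above.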
Methods for controlling the FDR
Benjamini and Hochberg (1995) propose a simple step-up procedure for controlling the FDR at level q, which was applied to neuroimaging data by Genovese et al. (2002). This procedure is called step-up because it uses an adaptive threshold which depends on the ordered P values P(1) ≤ P(2) ≤ … ≤ P(m), where the subscript in parentheses denotes the order. Let v(i) denote the voxel corresponding to P value P(i), and let d be the largest i for which
The BH (Benjamini and Hochberg) procedure
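A step-up rule of this kind can be sketched in Python as follows. Because the snippet above is truncated before the criterion for d, the sketch uses the standard statement of the Benjamini-Hochberg rule, in which d is the largest i with P(i) ≤ (i/m)q and the voxels v(1), …, v(d) are declared active. The function name `bh_active` is hypothetical, and the sketch is not necessarily the exact variant evaluated in the paper.

```python
def bh_active(p_values, q=0.05):
    """Benjamini-Hochberg step-up rule (standard formulation).

    Returns a list of booleans, one per voxel, in the original order.
    """
    m = len(p_values)
    # order[k] is the index of the voxel with the (k + 1)-th smallest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    d = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank that satisfies the criterion (step-up).
        if p_values[idx] <= rank * q / m:
            d = rank
    active = [False] * m
    for idx in order[:d]:
        active[idx] = True
    return active


# The second-smallest P value fails its own cutoff, but a larger rank
# passes, so every voxel up to that rank is still declared active.
print(bh_active([0.01, 0.04, 0.03, 0.035], q=0.05))
```

The example illustrates why the threshold is called adaptive: the cutoff applied to any one voxel depends on the whole set of ordered P values.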
Basic design
Data are generated to simulate a bilateral finger-tapping fMRI block design experiment where the true motor activation structure is known so that each of the thresholding methods can be evaluated.
A 64 × 64 slice is selected for analysis, within which two 7 × 7 ROIs, highlighted in Fig. 1, are designated to have activation. For this slice, simulated fMRI data are constructed according to a regression model which consists of an intercept and a time trend for all voxels, but also a reference function
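A data-generating step of this general shape might look like the Python sketch below. Only the 64 × 64 slice and the two 7 × 7 ROIs come from the text; the ROI positions, the number of scans, the block length in scans, the off-first ordering, the coefficient values, and the use of independent Gaussian noise are all assumptions made for illustration. In particular, the spatial correlation that the paper studies is not modeled here.

```python
import random

N = 64    # slice is N x N voxels (from the text)
ROI = 7   # each ROI is ROI x ROI voxels (from the text)
# Hypothetical top-left corners of the two ROIs; the true positions are
# shown only in the paper's Fig. 1.
ROI_CORNERS = [(28, 12), (28, 45)]


def activation_mask():
    """Boolean N x N mask that is True inside the two ROIs."""
    mask = [[False] * N for _ in range(N)]
    for r0, c0 in ROI_CORNERS:
        for r in range(r0, r0 + ROI):
            for c in range(c0, c0 + ROI):
                mask[r][c] = True
    return mask


def boxcar(n_scans, block_len):
    """Alternating off/on reference function, starting with an off block."""
    return [1.0 if (t // block_len) % 2 == 1 else 0.0 for t in range(n_scans)]


def simulate_voxel(active, ref, rng, b0=100.0, b1=0.01, b2=2.0, sigma=1.0):
    """Intercept + linear time trend (+ reference effect if active) + noise.

    All coefficient values are illustrative, not the paper's.
    """
    return [
        b0 + b1 * t + (b2 * x if active else 0.0) + rng.gauss(0.0, sigma)
        for t, x in enumerate(ref)
    ]


rng = random.Random(0)
mask = activation_mask()
ref = boxcar(n_scans=128, block_len=8)  # assumed scan counts
series = simulate_voxel(mask[30][14], ref, rng)
print(sum(row.count(True) for row in mask), len(series))
```

Because the mask of truly active voxels is known, any thresholded map computed from such data can be scored against it for false positives and detection power.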
Real fMRI example
To illustrate the thresholding methods described in this paper, a bilateral finger-tapping experiment was performed with the same design as the previous simulation study. To generate the functional data, bilateral finger tapping was performed in a block design with eight epochs of 16 s on and 16 s off. Scanning was performed using a 3-T Bruker Biospec in which 15 axial slices of size 64 × 64 were acquired. Each voxel has dimensions in mm of 3.125 × 3.125 × 5, with TE = 27.2 ms. Observations
Conclusion
This simulation study highlights two important findings. First, as has been indicated previously by other authors, the FDR-controlling methods generally have higher power than FWE-controlling methods to detect active voxels. The average magnitude of this power improvement was approximately 14% in the simulations considered, but this is likely to be sensitive to the underlying parameters involved and the size of the image considered. In general, the FDR criterion is more robust to the size of
References (33)
- et al. Generic brain activation mapping in functional magnetic resonance imaging: a nonparametric approach. Magn. Reson. Imaging (1997)
- et al. Analysis of fMRI time series revisited-again. NeuroImage (1995)
- et al. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J. Stat. Plan. Inference (1999)
- et al. Processing strategies for time-course data sets in functional MRI of the human brain. Magn. Reson. Med. (1993)
- et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. (1995)
- et al. On adaptive control of the false discovery rate in multiple testing with independent statistics. J. Educ. Behav. Stat. (2000)
- et al. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. (2001)
- et al. Statistical methods of estimation and inference for functional MR image analysis. Magn. Reson. Med. (1996)
- et al. Colored noise and computational inference in neurophysiological (fMRI) time series analysis: resampling methods in time and wavelet domains. Hum. Brain Mapp. (2001)
- et al. Real-time functional magnetic resonance imaging. Magn. Reson. Med. (1995)