NeuroImage

Volume 35, Issue 1, March 2007, Pages 121-130

Non-normality and transformations of random fields, with an application to voxel-based morphometry

https://doi.org/10.1016/j.neuroimage.2006.11.037

Abstract

Parametric tests of linear models for images modeled as random fields are based, like ordinary univariate tests, on distributional assumptions. It is shown here that the effect of departures from these assumptions is more pronounced in random field tests than in the univariate case. Simulations are presented that investigate in detail the influence of smoothing, unbalancedness, and leverages on empirical thresholds. Under certain conditions, significance tests may become invalid. As a case study, the existence and effect of departures from normality in gray matter probability maps, commonly used in voxel-based morphometry, is investigated, together with the effect of different transformation strategies in which the degree of transformation is estimated from the data by maximum likelihood. The best results are achieved with a voxel-by-voxel transformation, suggesting heterogeneity of distributional form across the volume for this kind of data.

Introduction

The effect of non-normality on the performance of significance tests based on Student’s t or Snedecor’s F statistics was investigated early in the statistical literature (Pearson, 1929, Pearson, 1931, Pearson and Please, 1975, Box, 1953). The general conclusion from these and later studies is that, unless the deviation from normality is severe, the impact on classic univariate parametric tests is limited, in many cases only reducing the power of the test (see summary in Miller, 1986). In the first part of this study, simulations on the effect of non-normality on random fields were carried out to show their impact on the thresholds that determine achieved significance levels in procedures such as statistical parametric mapping (SPM, Friston et al., 1995). A specific study is justified by the fact that it is difficult to extrapolate the results of these early studies to the random field setting. On the one hand, random field theory tests (Worsley et al., 1992, Worsley et al., 1996) are based on an approximation of the distribution of the extrema of the random field, and for this reason they may be more sensitive to deviations from normality than statistics of central tendency such as t in the univariate setting (Westfall and Young, 1993, pp. 56–60). On the other hand, the common practice of enforcing a fixed spatial correlation structure by smoothing the data may reduce any original deviation from normality (Salmond et al., 2002).

The deviation from normality examined here concerns only the distribution of the data viewed voxel-by-voxel (marginal normality). This means that we are not concerned with the full distributional specification of the random field model, which would also require, for example, that the joint spatial distribution be normal or that the extent of spatial correlation be uniform across the volume (Hayasaka et al., 2004). In this respect, we also note that, while the distributional requirements of random field theory tests do not coincide with voxel-by-voxel distributional assumptions, the latter may be important for other types of tests or for multiple testing situations in general (Westfall and Young, 1993). If the correct voxel-by-voxel significance threshold varies across the volume, applying a uniform threshold amounts to weighting the evidence in individual voxels by unequal requirements (Beran, 1988).

This study is primarily motivated by the application of statistical tests to structural images, which often consist of probability values or coefficients. The issue is how much non-normality can be tolerated before the achieved Type I error rates are altered, making the test invalid or overly conservative. This issue affects unbalanced comparisons, and especially single-subject studies (Colliot et al., 2005, Kassubek et al., 2002, Woermann et al., 1999a, Woermann et al., 1999b), since here the consequences of non-normality are most marked. In the first part of this study we will show, with the help of Monte Carlo simulations, that even after spatial smoothing departures from normality can indeed affect random field thresholds, and that this occurs at levels of non-normality that would not raise concern in a univariate setting. However, the conservativeness of t random field tests generally ensures that the tests remain valid. Hence, one may be more afraid of losing power than of committing a Type I error when using random field theory tests, but we will show that exceptions are likely to occur at extreme degrees of unbalancedness, such as in tests of individual images, and with regressors with high leverages.
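The leverages referred to here are the diagonal elements h_ii of the hat matrix H = X(X'X)^{-1}X' of the design. The following is a minimal sketch, using a hypothetical design matrix rather than one from the paper, of how extreme unbalancedness drives the leverage of the minority observations toward the maximum of 1:

```python
# Sketch: leverages in an unbalanced two-group design (hypothetical example).
# Leverage is the diagonal of the hat matrix H = X (X'X)^{-1} X'; the single
# subject coded in the second group receives the maximal leverage of 1.
import numpy as np

n_a, n_b = 20, 1                                    # 20 controls, 1 case
group = np.r_[np.zeros(n_a), np.ones(n_b)]
X = np.column_stack([np.ones(n_a + n_b), group])    # intercept + group code

H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)
print(leverage[0], leverage[-1])                    # 0.05 for each control, 1.0 for the case
```

With leverage 1, the fitted value at that observation reproduces the observation exactly, so the evidence at that design point rests on the distribution of a single data point, which is where departures from normality bite hardest.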

In the second part of this study, we will examine the behavior of empirical thresholds computed from 114 gray matter probability maps estimated from MPRAGE images, a type of structural T1-weighted image used in voxel-based morphometry (Ashburner and Friston, 2000). Voxel-based morphometry is a prominent methodology in the analysis of structural data. We will show here that smoothing is only partially effective in reducing the impact of non-normality and that non-normality in fact affects the exact thresholds following a spatial pattern. To overcome this problem, we investigate the application of data-driven maximum likelihood estimates of the required amount of data transformation.
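To make concrete what such a data-driven estimate involves, here is a minimal sketch of a voxel-by-voxel maximum likelihood fit of a Box–Cox exponent across subjects. The use of scipy.stats.boxcox and the small offset guarding against zero probabilities are our assumptions for illustration, not details taken from the paper:

```python
# Sketch of a voxel-wise, data-driven transformation: at each voxel, the
# Box-Cox exponent is estimated across subjects by maximum likelihood
# (scipy.stats.boxcox performs the ML fit when no exponent is supplied).
import numpy as np
from scipy.stats import boxcox

def voxelwise_boxcox(gm, eps=1e-6):
    """gm: subjects x voxels array of gray matter probabilities in [0, 1]."""
    n_subj, n_vox = gm.shape
    transformed = np.empty_like(gm, dtype=float)
    lambdas = np.empty(n_vox)
    for v in range(n_vox):
        # boxcox requires strictly positive data, hence the offset (our choice)
        y, lam = boxcox(gm[:, v] + eps)
        transformed[:, v] = y
        lambdas[v] = lam
    return transformed, lambdas

# Usage with random stand-in data (114 subjects, as in the case study):
gm = np.random.default_rng(1).beta(2.0, 5.0, size=(114, 1000))
z, lam = voxelwise_boxcox(gm)
```

A uniform transformation would instead fit a single exponent for the whole volume; the voxel-by-voxel fit lets the degree of transformation follow the heterogeneity of distributional form across the volume.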

This study extends existing work on voxel-based morphometry data (Salmond et al., 2002) in several respects. Firstly, we provide simulations exploring in detail the effect of smoothing, unbalancedness, and regressor leverages on the asymmetry of empirical thresholds in tests on random fields. Secondly, in the case study with gray matter probability maps, we use a much larger sample of volumes from an adult population, while Salmond et al. (2002) focused on structural images of a small sample of children. The larger sample allows us to compute maps of the non-uniform distribution of areas where overthreshold voxels occur as a result of Type I errors. Thirdly, we investigate the impact of different transformation strategies on the empirical thresholds and the spatial distribution of overthreshold voxels. In particular, we study the shortcomings of uniform transformations at smoothing kernels of 4 mm or less and compare their performance to that of transformations estimated from the data.

Section snippets

Simulations

All code implementing the algorithms and the simulations presented here was developed in MATLAB 6.1 R12 (The MathWorks, Natick, MA) installed on a Pentium PC running Windows 2000 (Microsoft, Redmond, WA). Transformed artificial random fields of size 32 × 32 × 32 voxels (as in the simulations of Nichols and Hayasaka, 2003) were created by convolving a Gaussian kernel of full width at half-maximum (FWHM) 4 voxels with a standard normal deviate x after applying the inverse Box–Cox transformation f(x; α, β) …
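A minimal sketch of this field generation step follows. The paper's exact parameterization f(x; α, β) is truncated in the snippet above, so the standard one-parameter inverse Box–Cox transform is assumed here, with exponent 0 (i.e., exponentiation) producing a positively skewed lognormal field:

```python
# Sketch of the simulation setup: a 32 x 32 x 32 standard normal field is
# transformed to induce skewness, then smoothed with a Gaussian kernel of
# FWHM 4 voxels. The inverse Box-Cox parameterization is an assumption here,
# since the paper's own definition f(x; alpha, beta) is truncated above.
import numpy as np
from scipy.ndimage import gaussian_filter

FWHM = 4.0                                          # kernel width, in voxels
SIGMA = FWHM / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # convert FWHM to sigma

def inv_boxcox(y, lmbda):
    """Inverse Box-Cox: exp(y) for lmbda == 0, else (lmbda*y + 1)**(1/lmbda)."""
    if lmbda == 0.0:
        return np.exp(y)
    return np.power(lmbda * y + 1.0, 1.0 / lmbda)

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32, 32))          # standard normal deviates
skewed = inv_boxcox(x, 0.0)                    # transform first, inducing skewness
field = gaussian_filter(skewed, sigma=SIGMA)   # then enforce spatial smoothness
```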

Effect of skewness and kurtosis on distribution of extrema

In this first simulation, the inverse Box–Cox and power transformations were applied prior to smoothing artificial random fields, obtaining data with increasing skewness and kurtosis (see Materials and methods for details). The plots of the empirical and the theoretical thresholds are displayed in Fig. 1.

One can see that the effect of skewness on the Monte Carlo thresholds can be substantial and increases as the smoothing kernel becomes smaller (top half of Fig. 1). The first trial, marked with …
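For reference, empirical thresholds of the kind plotted in Fig. 1 can be obtained by replicating the field generation many times and taking the 95th percentile of the distribution of maxima. The following sketch reuses the field generation from the previous sketch; the number of replications and the crude global standardization are our simplifications, not the paper's procedure:

```python
# Sketch: Monte Carlo estimate of the 0.05 familywise threshold as the 95th
# percentile of the distribution of field maxima (simplified illustration).
import numpy as np
from scipy.ndimage import gaussian_filter

SIGMA = 4.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
rng = np.random.default_rng(2)

maxima = []
for _ in range(1000):
    x = rng.standard_normal((32, 32, 32))
    field = gaussian_filter(np.exp(x), sigma=SIGMA)   # skewed, then smoothed
    z = (field - field.mean()) / field.std()          # crude standardization
    maxima.append(z.max())

print(np.percentile(maxima, 95))   # empirical threshold for the maximum
```

Comparing this empirical threshold with the theoretical random field threshold at the same smoothness is what reveals the impact of the induced skewness.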

Discussion

The Monte Carlo studies presented here demonstrate that the effect of sampling extrema dominates over that of spatial smoothing (Fig. 1). Even at relatively large smoothing kernels the effect of skewness on effective Type I error rates is noticeable. In t tests, the central limit theorem further limits the impact of skewness. In these circumstances, the effect on effective Type I error rates is complex: any amount of unbalancedness in the data leads to noticeable …

References (25)

  • Colliot, O., et al. Individual voxel-based analysis of gray matter in focal cortical dysplasia. NeuroImage (2005).
  • Cook, R.D., et al. Residuals and Influence in Regression (1980).