Elsevier

NeuroImage

Volume 29, Issue 2, 15 January 2006, Pages 396-408
Application of artificial neural network to fMRI regression analysis

https://doi.org/10.1016/j.neuroimage.2005.08.002

Abstract

We used an artificial neural network (ANN) to detect correlations between event sequences and functional magnetic resonance imaging (fMRI) signals. The layered feed-forward neural network, given a series of events as inputs and the fMRI signal as a supervised signal, performed a non-linear regression analysis. This type of ANN is capable of approximating any continuous function, and thus this analysis method can detect any fMRI signals that correlate with corresponding events. Because ANNs are so flexible, fitting to autocorrelation noise is a potential problem in fMRI analyses. We avoided this problem by using cross-validation and an early stopping procedure. The results showed that the ANN could detect various responses with different time courses. The simulation analysis also indicated an additional advantage of the ANN over non-parametric methods in detecting parametrically modulated responses, i.e., it can detect various types of parametric modulation without a priori assumptions. The ANN regression analysis is therefore beneficial for exploratory fMRI analyses in detecting continuous changes in responses modulated by changes in input values.

Introduction

In functional magnetic resonance imaging (fMRI) studies, brain activations are detected by searching for signal changes that correlate with certain events. In the general linear model (GLM) approach (Friston et al., 1995), a regressor representing the canonical hemodynamic response function (HRF) is used to detect such correlations. However, if the shape of the hemodynamic response differs greatly from the pre-assumed shape (Aguirre et al., 1998, Miezin et al., 2000) or an unknown process mediates such correlations, those correlations cannot be detected. In some cases, a combination of regressors (the canonical HRF, its temporal derivative, and a dispersion derivative) is used to absorb this diversity of response shape (Friston et al., 1998a, Friston et al., 1998b). Although this method can absorb minor deviations from the canonical HRF, it cannot absorb all diversity, and response variability remains a problem.
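The GLM idea described above can be illustrated with a minimal sketch. It assumes a commonly used double-gamma shape for the canonical HRF, a repetition time of 2 s, and a single regressor without an intercept; the function names (`double_gamma_hrf`, `hrf_regressor`, `fit_amplitude`) and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def double_gamma_hrf(t, shape_peak=6.0, shape_under=16.0, ratio=1.0 / 6.0):
    """Illustrative double-gamma HRF at time t (seconds); parameters are assumptions."""
    if t <= 0:
        return 0.0
    peak = t ** (shape_peak - 1) * math.exp(-t) / math.gamma(shape_peak)
    under = t ** (shape_under - 1) * math.exp(-t) / math.gamma(shape_under)
    return peak - ratio * under

def hrf_regressor(events, tr=2.0, hrf_len=16):
    """Convolve an event sequence (one value per scan) with the sampled HRF."""
    kernel = [double_gamma_hrf(k * tr) for k in range(hrf_len)]
    return [sum(events[t - k] * kernel[k] for k in range(min(t + 1, hrf_len)))
            for t in range(len(events))]

def fit_amplitude(regressor, signal):
    """Least-squares amplitude for a single regressor (no intercept, for brevity)."""
    return (sum(r * s for r, s in zip(regressor, signal))
            / sum(r * r for r in regressor))

# Two events, eight scans apart; the "measured" signal is synthetic and noise-free.
events = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
regressor = hrf_regressor(events)
signal = [2.0 * r for r in regressor]
beta = fit_amplitude(regressor, signal)  # recovers the amplitude of the synthetic response
```

Because the regressor has a fixed shape, a response whose time course differs strongly from that shape yields a small fitted amplitude even when it is tightly locked to the events, which is the limitation the paragraph above points out.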

Various other approaches, which do not assume the shape of the HRF a priori, have been proposed including selective averaging (Dale and Buckner, 1997), smooth FIR filters (Goutte et al., 2000), and non-parametric Bayesian estimation of the HRF (Marrelec et al., 2003). The primary advantage of these methodologies, which are called non-parametric methods because no parametric models of the HRF are used, is that even when the response functions diverge from region to region or subject to subject, correlated responses can be detected.

In this study, we propose another ‘semi-parametric’ method for fMRI analysis: the use of a very general class of functional forms to build more flexible models (Bishop, 1995). The proposed method uses a feed-forward layered artificial neural network (ANN) to describe the non-linear dynamic system of the hemodynamic response. The recent history of an event sequence is used as the input, and the BOLD signal from a particular voxel is used as the supervised signal. This type of artificial neural network is called a multi-layer perceptron (MLP). It is known that, given sufficiently many hidden units, an MLP with one or more hidden layers whose output functions are sigmoid functions can approximate any continuous function to any degree of accuracy (Funahashi, 1989, Hornik et al., 1989). Thus, this method can perform a non-linear regression analysis between BOLD signals and events without explicit modeling of the response function.
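One way to picture an input built from the recent history of events is a sliding window over the event sequence. The sketch below is only an assumption about how such an input could be laid out: the helper name `lagged_inputs`, the window length, and the zero padding before the first scan are all hypothetical, and the paper's actual encoding is not given in this excerpt.

```python
def lagged_inputs(events, n_lags):
    """Build one input vector per scan from the current and previous event values.

    Row t is [events[t], events[t-1], ..., events[t-n_lags+1]];
    positions before the start of the run are padded with 0.0 (an assumption).
    """
    rows = []
    for t in range(len(events)):
        rows.append([events[t - k] if t - k >= 0 else 0.0 for k in range(n_lags)])
    return rows

# Four scans, with an event of value 1 at scan 0 and an event of value 2 at scan 3.
inputs = lagged_inputs([1, 0, 0, 2], n_lags=3)
```

Each row would then be paired with the BOLD value of one voxel at the same scan, which serves as the supervised target.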

In the MLP, the hidden units receive the input vector $x_{\text{in}}$ multiplied by an adjustable weight matrix $W_{\text{hidden-in}}$, plus a bias value $b_{\text{hidden}}$. The hidden units apply a general class of transfer function, the sigmoid function tanh, and pass the result to the output layer:

$$x_{\text{hidden}} = \tanh\left(W_{\text{hidden-in}}\, x_{\text{in}} + b_{\text{hidden}}\right)$$

At the output layer, the outputs of the hidden layer $x_{\text{hidden}}$ are multiplied by an adjustable weight matrix $W_{\text{out-hidden}}$ and a bias value $b_{\text{out}}$ is added:

$$y = W_{\text{out-hidden}}\, x_{\text{hidden}} + b_{\text{out}}$$

where $y$ is the network output. Thus, an MLP approximates the regression function $y = f(x_{\text{in}})$ by a weighted sum of sigmoid functions with adjustable input weights. Hereafter, we call this regression the ANN regression.
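The two layer equations can be written out directly. The following is a minimal sketch of the forward pass for a single output (one voxel), using plain Python lists; the function name `mlp_forward` and the example weights are illustrative assumptions, and no training is shown.

```python
import math

def mlp_forward(x_in, W_hidden_in, b_hidden, W_out_hidden, b_out):
    """Forward pass of a one-hidden-layer MLP with tanh hidden units.

    W_hidden_in: one row of input weights per hidden unit.
    W_out_hidden: one output weight per hidden unit (single output).
    Returns (y, x_hidden).
    """
    # Hidden layer: tanh of the weighted input plus bias, per hidden unit.
    x_hidden = [math.tanh(sum(w * x for w, x in zip(row, x_in)) + b)
                for row, b in zip(W_hidden_in, b_hidden)]
    # Output layer: linear combination of hidden outputs plus output bias.
    y = sum(w * h for w, h in zip(W_out_hidden, x_hidden)) + b_out
    return y, x_hidden

# Two inputs, two hidden units; the weights are arbitrary example values.
y, x_hidden = mlp_forward(
    x_in=[1.0, -1.0],
    W_hidden_in=[[1.0, 0.0], [0.0, 1.0]],
    b_hidden=[0.0, 0.0],
    W_out_hidden=[1.0, 1.0],
    b_out=0.5,
)
```

In this example the two hidden outputs are tanh(1) and tanh(-1), which cancel in the output sum, so the output equals the output bias.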

The ANN regression is known to produce results equivalent to the Volterra kernel method (Volterra, 1959) without explicit definition of the kernel functions (Wray and Green, 1994). Friston et al. (1998b, 2000) have already applied the Volterra series approach to fMRI analysis. In their approach, the kernel functions are explicitly modeled (Friston et al., 2000). From the viewpoint of the MLP, this approach can be seen as using a small number of pre-assumed transfer functions at the hidden layer and only a restricted number of input-hidden connections. In contrast, the proposed ANN regression has no explicit form of response function and uses fully adjustable connections between the input and hidden layers. Because an adequately trained network is equivalent to the Volterra series expansion with all the dimensions of that system (Wray and Green, 1994), the ANN regression provides more flexibility than using only a restricted set of Volterra kernels. Furthermore, such a network can describe any relation between input and output series.

The ANN regression analysis has another advantage over non-parametric methods: it can detect any type of response modulation by the input values. This type of analysis, known as parametric modulation analysis, deals not only with the existence of activation, but also with how that activation is modulated by the parameters of the event. To detect parametric modulation using non-parametric methods, we have to explicitly model the shape of the modulation; otherwise we can detect only linear modulations. The ANN regression, in contrast, does not require explicit modeling of the modulation shape. Furthermore, any continuous modulation can be detected, owing to the ability of the ANN regression to adjust to any continuous function.
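A small numerical example shows why a purely linear modulator can miss a non-linear modulation. The data below are synthetic and chosen for illustration only: the response amplitude follows an inverted-U function of the event parameter, and `pearson_r` is a hypothetical helper written for this sketch.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Synthetic inverted-U modulation: amplitude is largest at the middle parameter value.
param = [-2.0, -1.0, 0.0, 1.0, 2.0]
amplitude = [4.0 - p * p for p in param]

r_linear = pearson_r(param, amplitude)                      # linear modulator
r_quadratic = pearson_r([p * p for p in param], amplitude)  # explicitly modeled shape
```

Here the linear modulator is uncorrelated with the amplitude even though the amplitude depends entirely on the parameter, whereas the explicitly modeled quadratic term captures it. A regression that can adapt its functional form does not need the quadratic shape to be specified in advance.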

This paper presents a method of ANN regression that can be applied to fMRI studies. In fMRI analyses, the autocorrelation noise present in the BOLD signal time course (Bullmore et al., 1996, Purdon and Weisskoff, 1998, Woolrich et al., 2001) may be a problem for the ANN regression, because the ANN can fit any relation between events and BOLD signals. We examined this problem by analyzing null-task fMRI data and synthetic white noise data. In the following section, we describe the application of the ANN method to a practical case: a memory-guided saccade task. Because this task includes various responses with different shapes, it is a good example of how the ANN method can fit various shapes of responses. Finally, to explore the potential of the ANN method, a parametric modulation analysis was performed. Using a synthetic data set, we compared the ability of the ANN method and a non-parametric method to detect non-linear parametric modulations.

Section snippets

Analysis methods

Our analysis used a multi-layer ANN to regress fMRI signals using a history of recent events. The ANN is a network of simple processing nodes connected by weights (Hertz et al., 1991). The model was inspired by biological processes, i.e., modeling biological neural network architectures such as parallel processing, with the node transfer function providing simple modeling of neural activation. Although it has been used for modeling various perceptual and cognitive processes in

ANN fitting to white noise and null fMRI data

When the ANN regression is applied to fMRI data, autocorrelation noise, which is often involved in a BOLD signal time course (Bullmore et al., 1996), may become a problem. Although we were able to avoid over-fitting by using the cross-validation procedure described above, an ANN may fit autocorrelation noise because it is not white noise. In this section, we examine how an ANN fits to noise signals, i.e., synthetic white noise and a null fMRI signal (no task was applied and the subject rested

Application to practical data: memory-guided saccade task

In an earlier section, we showed that the ANN did not regress the autocorrelation noise relative to the white noise. However, the ANN may fit noise as well as activation signals. In this section, we describe the use of a practical task experiment to examine whether we could discriminate activation from noise in the ANN regression result.

In the experiment, a memory-guided saccade task (Funahashi et al., 1989, Funahashi et al., 1991, Hikosaka and Wurtz, 1983) was performed. The subject was

Parametric modulation analysis using an ANN regression and non-parametric methods

One advantage of the ANN analysis over non-parametric methods is in finding responses modulated by the stimulus parameters in a parametric modulation analysis (e.g., Pinel et al., 2001, Riecker et al., 2003, Coull et al., 2004). This analysis tells us not only ‘which area’ of the brain was activated by the stimulus, but also ‘how’ the brain region was activated by changes in the stimulus parameters. Though conventional non-parametric methods can describe response modulations, these methods

General discussion

This paper examined the application of an artificial neural network (ANN) regression analysis to fMRI data. To reduce the computational time, we used an RPROP algorithm (Riedmiller, 1994, Riedmiller and Braun, 1996), and to avoid over-fitting we used early stopping with cross-validation. The result of fitting to synthetic white noise and null fMRI data indicated that the autocorrelation noise intrinsically included in most fMRI time course data was less problematic in the ANN regression.
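The two training ingredients named above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: `rprop_step` follows the common sign-based RPROP scheme in a variant that skips the weight change after a gradient sign flip, with the usual default factors 1.2 and 0.5, and `early_stopping_epoch` uses a simple patience rule rather than any specific criterion from the cited literature. Both function names are hypothetical.

```python
def rprop_step(weights, grads, prev_grads, steps,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """One RPROP update: only the sign of each gradient is used.

    Returns (new_weights, grads_to_remember, new_steps).
    """
    new_w, new_g, new_s = [], [], []
    for w, g, pg, s in zip(weights, grads, prev_grads, steps):
        if g * pg > 0:        # same sign as before: grow the step size
            s = min(s * eta_plus, step_max)
        elif g * pg < 0:      # sign flipped: shrink the step, skip this update
            s = max(s * eta_minus, step_min)
            g = 0.0
        if g > 0:
            w -= s
        elif g < 0:
            w += s
        new_w.append(w)
        new_g.append(g)
        new_s.append(s)
    return new_w, new_g, new_s

def early_stopping_epoch(val_errors, patience=5):
    """Return the epoch with the lowest validation error seen before patience runs out."""
    best_epoch, best_err = 0, float("inf")
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err = epoch, err
        elif epoch - best_epoch >= patience:
            break
    return best_epoch

# Toy usage: minimize (w - 3)^2 with RPROP, starting from w = 0.
w, prev_g, step = [0.0], [0.0], [0.1]
for _ in range(200):
    grad = [2.0 * (w[0] - 3.0)]
    w, prev_g, step = rprop_step(w, grad, prev_g, step)

stop_at = early_stopping_epoch([1.0, 0.8, 0.6, 0.7, 0.65, 0.9], patience=2)
```

In practice the validation errors would come from held-out scans, and the weights saved at the returned epoch would be used for the final fit.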

Acknowledgment

We would like to thank Dr. Makoto Kato for his helpful comments and technical assistance with the memory-guided saccade task.

References (48)

  • T.T. Liu. Efficiency, power, and entropy in event-related fMRI with multiple trial types Part II: design of experiments. NeuroImage (2004)
  • T.T. Liu et al. Efficiency, power, and entropy in event-related FMRI with multiple trial types Part I: theory. NeuroImage (2004)
  • J.L. Marchini et al. A new statistical approach to detecting significant activation in functional MRI. NeuroImage (2000)
  • F.M. Miezin et al. Characterizing the hemodynamic response: effects of presentation rate, sampling procedure, and the possibility of ordering brain activity based on relative timing. NeuroImage (2000)
  • P. Pinel et al. Modulation of parietal activation by semantic distance in a number comparison task. NeuroImage (2001)
  • L. Prechelt. Automatic early stopping using cross validation: quantifying the criteria. Neural Netw. (1998)
  • A. Riecker et al. Parametric analysis of rate-dependent hemodynamic response functions of cortical and subcortical brain structures during auditorily cued finger tapping: a fMRI study. NeuroImage (2003)
  • J. Tanabe et al. Comparison of detrending methods for optimal fMRI preprocessing. NeuroImage (2002)
  • M.W. Woolrich et al. Temporal autocorrelation in univariate linear modeling of FMRI data. NeuroImage (2001)
  • D. Anguita et al. Speed Improvement of the back-propagation on current generation workstations
  • Y. Benjamini et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J.R. Stat. Soc. (1995)
  • C.M. Bishop. Neural Networks for Pattern Recognition (1995)
  • E. Bullmore et al. Statistical methods of estimation and inference for functional MR image analysis. Magn. Reson. Med. (1996)
  • J.T. Coull et al. Functional anatomy of the attentional modulation of time estimation. Science (2004)