A decade of decoding reward-related fMRI signals and where we go from here

doi:10.1016/j.neuroimage.2017.03.067

NeuroImage

Volume 180, Part A, 15 October 2018, Pages 324-333

Highlights

• MVPA has been used in recent years to study reward and decision making.
• Assumptions and models for interpreting these results are rarely discussed.
• This paper reviews selected MVPA studies on reward and what we have learned from them.
• It highlights questions about reward processing for which MVPA is particularly useful.

Abstract

Information about potential rewards in the environment is essential for guiding adaptive behavior, and understanding neural reward processes may provide insights into neuropsychiatric dysfunctions. Over the past 10 years, multivoxel pattern analysis (MVPA) techniques have been used to study brain areas encoding information about expected and experienced outcomes. These studies have identified reward signals throughout the brain, including the striatum, medial prefrontal cortex, orbitofrontal cortex, dorsolateral prefrontal cortex, and parietal cortex. This review article discusses some of the assumptions and models that are used to interpret results from these studies, and how they relate to findings from animal electrophysiology. The article reviews and summarizes some of the key findings from MVPA studies on reward. In particular, it first focuses on studies that, in addition to mapping out the brain areas that process rewards, have provided novel insights into the coding mechanisms of value and reward. Then, it discusses examples of how multivariate imaging approaches are being used more recently to decode features of expected rewards that go beyond value, such as the identity of an expected outcome or the action required to obtain it. The study of such complex and multifaceted reward representations highlights the key advantage of using representational methods, which are uniquely able to reveal these signals and may narrow the gap between animal and human research. Applied in a clinical context, MVPA may advance our understanding of neuropsychiatric disorders and the development of novel treatment strategies.

Introduction

Approaching potential rewards in the environment while avoiding danger is critical for survival. To enable such adaptive behaviors, nervous systems of humans and other animals must process and represent reward-related information about future and experienced outcomes. In turn, disruptions in neural reward processing may result in maladaptive behaviors such as those observed in patients with neurological and psychiatric disorders.

Neuroscience has a rich history of studying reward processing in animals using lesions (Everitt and Stacey, 1987, Gaffan and Murray, 1990, Holland and Gallagher, 1993, Schoenbaum et al., 2003), electrophysiology (Romo and Schultz, 1990, Schoenbaum et al., 1998, Tremblay and Schultz, 1999) and, more recently, optogenetics (Ferenczi et al., 2016, Gremel and Costa, 2013, Steinberg et al., 2013). In contrast, the study of human reward processing has traditionally been limited to evidence from patients with lesions in circumscribed brain areas (Bechara et al., 1994, Tsuchida et al., 2010). This drastically changed with the invention of functional magnetic resonance imaging (fMRI), which has opened a window into the neural mechanisms of reward in humans (Breiter et al., 2001, Knutson et al., 2001, O’Doherty et al., 2001). Although univariate fMRI studies have taught us a great deal about the brain regions involved in reward processing (Bartra et al., 2013, Clithero and Rangel, 2013), they are restricted to brain areas in which neuronal activity is relatively uniformly associated with reward across voxels and individuals. The advent of multivariate approaches to fMRI data analysis (Haxby et al., 2001, Haynes and Rees, 2005, Kamitani and Tong, 2005, Kriegeskorte et al., 2006) mitigated these limitations and enabled us to study brain regions in which neurons represent reward-related information heterogeneously across space and individuals.
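The contrast between univariate and multivariate analysis drawn above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a method taken from any of the reviewed studies: the data are synthetic, the function names (`simulate_patterns`, `univariate_difference`, `cross_validated_accuracy`) are hypothetical, and a simple nearest-centroid classifier stands in for whatever classifier a given study actually used. Half of the simulated voxels increase their response with reward and half decrease it, so the average signal in the region carries almost no information while the spatial pattern does.

```python
import numpy as np


def simulate_patterns(n_trials=100, n_voxels=50, effect=1.0, seed=0):
    """Synthetic trials-by-voxels data with a heterogeneous reward code (assumed)."""
    rng = np.random.default_rng(seed)
    # First half of the trials: no reward (0); second half: reward (1).
    labels = np.repeat([0, 1], n_trials // 2)
    # Even-indexed voxels respond positively to reward, odd-indexed negatively.
    signs = np.where(np.arange(n_voxels) % 2 == 0, 1.0, -1.0)
    signal = np.outer(labels - 0.5, signs * effect)
    data = signal + rng.normal(size=(n_trials, n_voxels))
    return data, labels


def univariate_difference(data, labels):
    """Difference in region-average activity between reward and no-reward trials."""
    roi_mean = data.mean(axis=1)
    return roi_mean[labels == 1].mean() - roi_mean[labels == 0].mean()


def cross_validated_accuracy(data, labels, n_folds=5):
    """Nearest-centroid decoding accuracy with interleaved k-fold cross-validation."""
    fold_ids = np.arange(len(labels)) % n_folds
    n_correct = 0
    for fold in range(n_folds):
        test = fold_ids == fold
        train = ~test
        # Class centroids are estimated from training trials only.
        centroid_0 = data[train & (labels == 0)].mean(axis=0)
        centroid_1 = data[train & (labels == 1)].mean(axis=0)
        dist_0 = np.linalg.norm(data[test] - centroid_0, axis=1)
        dist_1 = np.linalg.norm(data[test] - centroid_1, axis=1)
        predicted = (dist_1 < dist_0).astype(int)
        n_correct += int(np.sum(predicted == labels[test]))
    return n_correct / len(labels)


if __name__ == "__main__":
    data, labels = simulate_patterns()
    print("univariate difference:", univariate_difference(data, labels))
    print("decoding accuracy:", cross_validated_accuracy(data, labels))
```

In this toy setting the region-average difference stays close to zero because the positive and negative voxel responses cancel, whereas the cross-validated classifier should recover the reward label well above chance. Real fMRI data involve additional steps (response estimation, run-wise cross-validation, permutation testing) that this sketch deliberately omits.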

Over the past decade, fMRI studies using MVPA techniques have decoded a wide array of reward-related signals from the human brain. Individual studies have examined feedback-related signals predicting value-based choices (Hampton and O’Doherty, 2007), and how reinforcement signals from the outcome of simple games are represented (Vickery et al., 2011). Other studies have focused on decoding the subjective value of offers (Krajbich et al., 2009, Wang et al., 2014), the value of predicted outcomes (Kahnt et al., 2010, Kahnt et al., 2011b), consumer choices made inside the scanner (Grosenick et al., 2008), and differences between types of valuation and reward (Clithero et al., 2009, Clithero et al., 2011). Moreover, several studies used brain activity acquired during passive viewing (Levy et al., 2011), or even outside the focus of attention (Tusche et al., 2010, Tusche et al., 2013), to predict preference-based choices made outside the scanner (Smith et al., 2014). For instance, Tusche et al. (2010) presented a series of images of automobiles in the background while subjects were engaged in a demanding visual fixation task. The acquired fMRI data were then used to predict hypothetical consumer choices made after scanning. Similarly, Levy et al. (2011) showed that preference-related fMRI signals during passive viewing of images of consumer products can be used to predict actual choices at a later time point. Beyond demonstrating the technical feasibility of decoding real-world decisions based on fMRI signals, these studies have illustrated potential applications of reward-based decoding approaches to infer choice and valuation when overt behavior is unavailable (Grosenick et al., 2008, Smith et al., 2014).

In general, MVPA studies in the reward domain have confirmed previous results from univariate fMRI experiments and successfully decoded reward-related information in subcortical areas such as the striatum, and in areas of the prefrontal and parietal cortex. Section 2 of this article discusses common models and assumptions that relate MVPA findings to the reward-related firing of single neurons. (Note that no general overview of the technical aspects of fMRI decoding methods and multivariate classifiers is provided; the interested reader is referred to comprehensive review articles on this topic (Haynes, 2015, Misaki et al., 2010, Mourao-Miranda et al., 2005).) Beyond mapping reward-related brain regions, many MVPA studies have addressed additional questions regarding neural mechanisms of reward that would have been difficult to address with conventional imaging approaches. A selection of such studies is the focus of Section 3, which summarizes key points arising from these experiments. Section 4 discusses how representational imaging methods have more recently been used to address questions related to reward and goal-directed behavior that go beyond the encoding of value.

Section snippets

Reward coding in single neurons

How do neuronal populations encode information about reward? Single- and multi-unit recordings in animals indicate that depending on the brain region, the relationship between reward parameters and neuronal firing rates is either simple and homogeneous, or complex and heterogeneous. For instance, dopaminergic neurons in the substantia nigra and ventral tegmental area display a relatively homogeneous coding scheme, such that firing rates increase with the value of unpredicted rewards and

What have we learned from decoding reward signals?

In general, reward studies that use MVPA approaches have largely confirmed previous results from univariate fMRI studies. The following section focuses on a selection of studies that have used MVPA methods to answer questions related to reward processing that go beyond what is known from, and what can be typically achieved by, traditional univariate analysis methods.

Current topics in decoding reward

This section draws from studies that highlight the potential of MVPA methods to ask more detailed questions about the nature of reward representations. Specifically, it reviews studies that use MVPA to demonstrate how reward signals can take the form of a common currency for value, and how these general value signals contrast with highly specific reward representations which simultaneously encode multiple features of expected outcomes that are not necessarily related to value.
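One way the common-currency idea is often operationalized with MVPA is cross-decoding: a classifier is trained to separate high from low value for one reward type and then tested on another. The sketch below is an illustration under stated assumptions rather than the procedure of any specific study reviewed here. The data are synthetic, the reward types are hypothetical placeholders ("A" and "B"), and all function names are invented for this example. If the two reward types share a value pattern, the classifier should transfer; if value is carried by non-overlapping voxels, transfer accuracy should stay near chance.

```python
import numpy as np


def make_condition(value_pattern, n_trials, rng):
    """Synthetic low-value (0) and high-value (1) trials for one reward type."""
    labels = np.repeat([0, 1], n_trials // 2)
    noise = rng.normal(size=(n_trials, value_pattern.size))
    data = np.outer(labels - 0.5, value_pattern) + noise
    return data, labels


def cross_decode(train_data, train_labels, test_data, test_labels):
    """Train a nearest-centroid classifier on one reward type, test on another."""
    centroid_0 = train_data[train_labels == 0].mean(axis=0)
    centroid_1 = train_data[train_labels == 1].mean(axis=0)
    dist_0 = np.linalg.norm(test_data - centroid_0, axis=1)
    dist_1 = np.linalg.norm(test_data - centroid_1, axis=1)
    predicted = (dist_1 < dist_0).astype(int)
    return float(np.mean(predicted == test_labels))


def common_currency_demo(n_trials=200, n_voxels=50, seed=1):
    """Return cross-decoding accuracy for a shared and a distinct value code."""
    rng = np.random.default_rng(seed)
    half = n_voxels // 2
    alternating = np.where(np.arange(half) % 2 == 0, 1.0, -1.0)

    # Reward type A: value is encoded in the first half of the voxels.
    pattern_a = np.zeros(n_voxels)
    pattern_a[:half] = alternating
    # Shared code: reward type B uses the same value pattern as A.
    pattern_b_shared = pattern_a.copy()
    # Distinct code: reward type B encodes value in the other voxels.
    pattern_b_distinct = np.zeros(n_voxels)
    pattern_b_distinct[half:] = alternating

    data_a, labels_a = make_condition(pattern_a, n_trials, rng)
    data_shared, labels_shared = make_condition(pattern_b_shared, n_trials, rng)
    data_distinct, labels_distinct = make_condition(pattern_b_distinct, n_trials, rng)

    acc_shared = cross_decode(data_a, labels_a, data_shared, labels_shared)
    acc_distinct = cross_decode(data_a, labels_a, data_distinct, labels_distinct)
    return acc_shared, acc_distinct


if __name__ == "__main__":
    shared, distinct = common_currency_demo()
    print("cross-decoding accuracy, shared value code:", shared)
    print("cross-decoding accuracy, distinct value codes:", distinct)
```

Successful transfer in such an analysis is consistent with a general value signal, while a failure to transfer alongside successful within-type decoding points toward representations that are specific to the reward type.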

Discussion and conclusions

Multivariate decoding techniques have offered new and exciting ways to analyze fMRI data, and they have substantially extended the scope of questions that can be addressed. The study of human reward learning and decision-making has benefitted from this advance. The main takeaway from reward studies using MVPA methods is that in many regions of the brain, encoding of reward information is not limited to value. Specifically, reward predictive signals, especially in prefrontal cortex, incorporate

Acknowledgments

The author thanks Drs. J.D. Howard, L.P. Qu, and P.N. Tobler for insightful comments and suggestions. The author is supported by grants from the National Institute on Deafness and Other Communication Disorders (NIDCD) and the National Institute on Drug Abuse (NIDA), National Institutes of Health, USA.

References (129)

O. Bartra et al. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage (2013)
A. Bechara et al. Insensitivity to future consequences following damage to human prefrontal cortex. Cognition (1994)
E.D. Boorman et al. Two anatomically and computationally distinct learning signals predict changes to stimulus-outcome associations in hippocampus. Neuron (2016)
H.C. Breiter et al. Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron (2001)
E.S. Bromberg-Martin et al. Dopamine in motivational control: rewarding, aversive, and alerting. Neuron (2010)
R.M. Cichy et al. Encoding the identity and location of objects in human LOC. Neuroimage (2011)
J.A. Clithero et al. Local pattern classification differentiates processes of economic valuation. Neuroimage (2009)
J.A. Clithero et al. Within- and cross-participant classifiers reveal different neural coding of information. Neuroimage (2011)
J.L. Gardner. Is cortical vasculature functionally organized? Neuroimage (2010)
K. Hackmack et al. Multi-scale classification of disease using structural MRI and wavelet transform. Neuroimage (2012)
J.D. Haynes. A primer on pattern-based approaches to fMRI: principles, pitfalls, and perspectives. Neuron (2015)
J.D. Haynes et al. Reading hidden intentions in the human brain. Curr. Biol. (2007)
T. Kahnt et al. Perceptual learning and decision-making in human medial frontal cortex. Neuron (2011)
T. Kahnt et al. Decoding different roles for vmPFC and dlPFC in multi-attribute decision making. Neuroimage (2011)
J.T. Klein et al. Social information signaling by neurons in primate striatum. Curr. Biol. (2013)
B. Knutson et al. A region of mesial prefrontal cortex tracks monetarily rewarding outcomes: characterization with rapid event-related fMRI. Neuroimage (2003)
N. Kriegeskorte et al. How does an fMRI voxel sample the neuronal activity pattern: compact-kernel or complex spatiotemporal filter? Neuroimage (2010)
Y.C. Leong et al. Dynamic interaction between reinforcement learning and attention in multidimensional environments. Neuron (2017)
D.J. Levy et al. The root of all value: a neural common currency for choice. Curr. Opin. Neurobiol. (2012)
M. Misaki et al. Comparison of multivariate classifiers and response normalizations for pattern-information fMRI. Neuroimage (2010)
P.R. Montague et al. Neural economics and the biological substrates of valuation. Neuron (2002)
J. Mourao-Miranda et al. Classifying brain states and determining the discriminating activation patterns: Support Vector Machine on functional MRI data. Neuroimage (2005)
T. Naselaris et al. Encoding and decoding in fMRI. Neuroimage (2011)
T. Naselaris et al. Bayesian reconstruction of natural images from human brain activity. Neuron (2009)
L. Pogoda et al. Multivariate representation of food preferences in the human brain. Brain Cogn. (2016)
A.G. Ramayya et al. Expectation modulates neural representations of valence throughout the human brain. Neuroimage (2015)
A. Alink et al. fMRI orientation decoding in V1 does not require global maps or globally coherent orientation stimuli. Front. Psychol. (2013)
H.C. Barron et al. Repetition suppression: a means to index neural representations using BOLD? Philos. Trans. R. Soc. Lond. B Biol. Sci. (2016)
K.A. Burke et al. The role of the orbitofrontal cortex in the pursuit of happiness and more specific rewards. Nature (2008)
R.M. Carter et al. A distinct role of the temporal-parietal junction in predicting socially guided decisions. Science (2012)
S.E. Cavanagh et al. Autocorrelation structure at rest predicts value correlates of single neurons during reward-guided choice. eLife (2016)
S.C. Chan et al. A probability distribution over latent causes, in the orbitofrontal cortex. J. Neurosci. (2016)
L.J. Chang et al. A sensitive and specific neural signature for picture-induced negative affect. PLoS Biol. (2015)
V.S. Chib et al. Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. J. Neurosci. (2009)
J. Chikazoe et al. Population coding of affect across stimuli, modalities and individuals. Nat. Neurosci. (2014)
R.M. Cichy et al. Resolving human object recognition in space and time. Nat. Neurosci. (2014)
J.A. Clithero et al. Informatic parcellation of the network involved in the computation of subjective value. Soc. Cogn. Affect. Neurosci. (2013)
T. Davis et al. Measuring neural representations with fMRI: practices and pitfalls. Ann. N. Y. Acad. Sci. (2013)
P. Domenech et al. The neuro-computational architecture of value-based selection in the human brain. Cereb. Cortex (2017)
J. Dubois et al. Single-unit recordings in the macaque face patch system reveal limitations of fMRI MVPA. J. Neurosci. (2015)
B.J. Everitt et al. Studies of instrumental behavior with sexual reinforcement in male rats (Rattus norvegicus): II. Effects of preoptic area lesions, castration, and testosterone. J. Comp. Psychol. (1987)
E.A. Ferenczi et al. Prefrontal cortical regulation of brainwide circuit dynamics and reward-related behavior. Science (2016)
T.H. FitzGerald et al. Action-specific value signals in reward-related regions of the human brain. J. Neurosci. (2012)
J. Freeman et al. Coarse-scale biases for spirals and orientation in human visual cortex. J. Neurosci. (2013)
D. Gaffan et al. Amygdalar interaction with the mediodorsal nucleus of the thalamus and the ventromedial prefrontal cortex in stimulus-reward associative learning in the monkey. J. Neurosci. (1990)
M.F. Glasser et al. The Human Connectome Project’s neuroimaging approach. Nat. Neurosci. (2016)
J.A. Gottfried et al. Human orbitofrontal cortex mediates extinction learning while accessing conditioned representations of value. Nat. Neurosci. (2004)
J.A. Gottfried et al. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science (2003)
C.M. Gremel et al. Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat. Commun. (2013)
L. Grosenick et al. Interpretable classifiers for FMRI improve prediction of purchases. IEEE Trans. Neural Syst. Rehabil. Eng. (2008)

