1 Introduction

What is common to architecture, basketball, bird watching, and chess? All of these seemingly disparate activities contain a central visual component, which is essential for mastering them and developing expertise in them. But how do people become experts in a given domain? What processes change following the acquisition of expertise, and how do the neural substrates supporting expertise differ from those in novices? Most broadly, expertise is defined as consistently superior performance within a specific domain relative to novices and relative to other domains [1]. The current article focuses on expertise in visual object recognition, an acquired skill that certain people show in discriminating between similar members of a homogeneous object category, a particularly demanding perceptual task. For example, car experts can distinguish between different car models [2] or make ad hoc distinctions, such as between Japanese and European cars [3], but this expertise does not generalize to other similar domains, including airplanes [2] or antique cars [4].

This article presents a neurocognitive approach to the study of visual expertise. Its aim is to explain the superior performance displayed by experts, often marked by unique behavioral signatures, by specifying the neural mechanisms underlying expertise and how they pertain to these signatures. Recent neuroimaging studies suggest that the neural substrates of expert object recognition are distributed and highly interactive, with the specific regions engaged defined by the domain of expertise and the particular information utilized by the expert. Through experience, this information comes to be extracted and processed through specific observer-based interactions both within the visual system and between visual regions and extrinsic systems, key amongst which are those supporting long-term conceptual knowledge and top-down attention. Such an interactive framework contrasts with the view of expertise as a predominantly sensory or perceptual skill supported by automatic stimulus-driven processes, often localized to discrete category-selective visual regions in occipito-temporal cortex (OTC) [5, 6]. I will start by describing the perceptual view of visual object expertise, as it has had a pronounced and long-lasting influence on the field of object expertise. I will then highlight its major theoretical and empirical limitations, and present evidence in favor of an alternative, interactive account of expertise. I will conclude by suggesting how this account can be generalized to explain other forms of expertise and used to guide the enhancement of perceptual-cognitive performance.

2 The Perceptual View of Expertise

The hallmark of expert object recognition is the ability of experts to quickly, effortlessly, and accurately categorize specific exemplars from a homogeneous object category [7]. The remarkable skill that experts display in within-category discrimination (also known as subordinate categorization) manifests in increased sensitivity to fine differences between highly similar stimuli. Thus, one intuitive possibility is that expert object recognition involves primarily changes to sensory or perceptual processing [5]. I refer to this view as the perceptual view of expertise. According to this view, new modes of processing emerge following the acquisition of expertise in order to overcome the inherent difficulty of within-category discrimination. In expertise with natural objects, such as cars or birds, this was suggested to entail a change from local part-based representations to holistic representations [8]. These perceptual processes become automatic over time, to the extent that the expert is assumed to recognize objects at their most specific identity (“Lassie”) as fast as they would recognize them at the basic level (“dog”) [9]. Such automatic stimulus-driven perceptual processing separates expert performance from that of the novice.

Given its stress on perceptual processing, the perceptual view of expertise suggests that expertise is accompanied by local changes to neural activation in specific regions of visual cortex [10–12]. This view is supported by the experience-dependent changes in neural tuning in areas of visual cortex reported in perceptual learning. For example, extensive training on orientation discrimination tasks results in stronger responses and narrower orientation tuning curves in early visual areas [13]. Similarly, long-term training with artificial objects is associated with specific changes in the response of high-level visual cortex [14].

One domain of visual expertise extensively investigated by the perceptual view of expertise is face perception. In terms of natural experience, recognizing faces is arguably the quintessential example of visual expertise [15]. Faces form a highly homogeneous set of stimuli with a very similar spatial configuration of parts, and therefore discriminating between individual faces should be, in theory, a difficult perceptual task. Nonetheless, humans are extremely adept at recognizing individual faces and categorizing faces along many other subordinate dimensions (e.g. race, gender). Face perception displays many unique characteristics that differentiate it from recognition of everyday objects. Face processing is holistic [16] and automatic: faces are recognized pre-attentively, and they capture attention exogenously [17]. Accordingly, the perceptual view contends that face recognition can be considered a general manifestation of perceptual expertise. Consequently, face processing and object expertise were suggested to share common cognitive and neural mechanisms. Expert object processing under this perceptual view is automatic and stimulus-driven, with little impact of attention, task demands, or other higher-level cognitive factors [6]. In line with this view, numerous works examined long-term visual expertise in object recognition using the face perception analogy (for a critical review of these works, see [29]). As in face perception, experts were shown to automatically process objects of expertise at their individual level, relying on holistic information. This was observed in a variety of object categories, including birds, cars, dogs, fingerprints, and chess displays (for a review see [5]). Similarly, the face-selective N170 ERP component was reported to be modulated by visual expertise [19], indicating that expertise impacts early, perceptual stages of processing.
Objects of expertise were suggested to be processed by face/category-selective regions in OTC, and particularly by the Fusiform Face Area (FFA: [20]) [6, 21–24]. FFA was suggested to support the subordinate categorization of objects (any objects, regardless of category), as long as they share a prototypical configuration of features requiring experience to discriminate between the category members. However, the extent to which FFA supports expert object recognition is highly debated, with several studies failing to find an increase in response magnitude in FFA as a function of expertise [2, 25–27].

Regardless of the controversy concerning the reproducibility of expertise effects in FFA, two issues arise when discussing the neural correlates of visual expertise. The first issue is whether the enhanced response in FFA may actually reflect the involvement of extraneous top-down factors, which are associated with the expertise of the observers but are not perceptual per se. For example, objects of expertise are more salient and engaging for the expert than for the novice [28], and such enhanced engagement of the expert with objects from their domain of expertise may enhance the magnitude of response in FFA. Enhanced engagement may denote many observer-based factors, such as specific recognition goals, depth of processing, arousal, and in particular task-based attention, which has been shown to modulate FFA’s magnitude of response [29]. This is an inherent concern when studying real-world expertise: since the experts are selected based on their particular skill, there is no random assignment to the experimental conditions. Thus, expertise-related modulation of neural activity might reflect controlled top-down modulation of activity in object-selective regions rather than the operation of a stimulus-driven automatic expert perceptual mechanism [18]. The second issue, also pertinent to the question of top-down attention impacting expert-related activity, is that the neuroimaging studies mentioned above focused primarily on the FFA (or other functionally-defined regions in visual cortex), ignoring the possibility that expertise effects may be expressed across the entire cortex, reflecting the operation of large-scale cortical networks.
If expertise involves top-down controlled processing, one network in particular that might be employed is the fronto-parietal dorsal attentional network [30]. Observing widespread patterns of activation that are characteristic of the dorsal attentional network, yet specific to category and level of expertise, would provide a strong indication that visual expertise recruits additional processes and is not a strictly perceptual skill. Hence, it stands to reason that a full account of the neural correlates of visual expertise, the extent to which they spread beyond visual cortex, and critically, their susceptibility to top-down modulation, will provide important insights into the nature of processing in visual expertise. Such studies are described next.

3 The Interactive View of Expertise

Clues regarding the full extent of expertise-related activity come from fMRI object training studies that reported changes following training outside of FFA, both in additional areas of OTC and outside of OTC, including superior temporal sulcus, posterior parietal cortex, and prefrontal cortex (for a review, see [31]). However, while these training studies demonstrate changes following experience, this type of visual-perceptual experience is only one aspect of real-world object expertise. Real-world natural objects embody rich information not only in terms of their appearance, but also in their function, motor affordances, and other semantic properties. Given these extended properties, the cortical representations of objects can be considered conceptual and distributed rather than sensory and localized [32]. Critically, it is these differences in conceptual associations that distinguish experts from novices, since long-term real-world expert object recognition is characterized not only by perceptual changes, but also by the ability to access relevant and meaningful conceptual information that is not available to non-experts [33–35]. In the acquisition of expertise, conceptual knowledge develops, along with other observer-based high-level factors (e.g. autobiographical memories, emotional associations), in conjunction with experience-dependent changes in perceptual processing. However, the conceptual properties of objects have not typically been manipulated in training studies such as those described above (but see [36]). Indeed, a complete account of real-world expert object recognition cannot ignore observer-based high-level factors, and must specify how stimulus-based sensory-driven processing interacts with high-level factors. For example, the expert’s increased knowledge and engagement may guide the extraction of diagnostic visual information, which in turn may be used to expand existing conceptual knowledge [34].
This experience-based interplay between conceptual and perceptual processing is at the core of the interactive view of expertise. The interactive view contrasts with the perceptual view of expertise (i.e. as automatic, domain-specific, and attention-invariant) and echoes a more general view of visual recognition as an interaction between stimulus information (“bottom-up”) and observer-based cognitive (“top-down”) factors such as goals, expectations, and prior knowledge [37–39]. It is important to note that while the interactive view does not support a strict stimulus-driven view of expert processing, it also does not suggest that the effects of experience are driven solely by top-down factors that operate independently of the perceptual processing in sensory cortex. Rather, the interactive view proposes that expert object recognition depends on both sensory stimulus-driven processing and higher-level cognitive factors, with a critical interaction between these processes, whereby the expert’s increased knowledge and attention guide the extraction of diagnostic visual information.

One of the first neuroimaging studies to explicitly test the role of task-based attentional engagement in visual expertise was Harel et al. (2010) [2]. They assessed the full extent of the neural substrates of visual expertise across the entire brain by presenting car experts and novices with images of cars, faces, and airplanes while they performed a standard one-back task, requiring detection of image repeats [11]. Directly contrasting the car-selective activation (cars vs. airplanes) of car experts with that of novices revealed widespread effects of expertise, which encompassed not only category-selective regions in OTC, but also retinotopic early visual cortex as well as areas outside of visual cortex, including the precuneus, intraparietal sulcus, and lateral prefrontal cortex. This widespread distributed effect of expertise overlaps with the fronto-parietal dorsal attentional network, and accordingly was suggested to reflect the increased level of top-down engagement that the experts have with their objects of expertise. This hypothesis was tested in a second experiment, which controlled the attentional engagement of experts by manipulating the task relevance of their objects of expertise [11]. Car experts and novices were presented with interleaved images of cars and airplanes, but were instructed to attend to cars in one half of the trials (making cars task-relevant) and to attend to airplanes in the other half of the trials (rendering cars task-irrelevant), responding whenever they saw an immediate image repeat in the attended category. A view of expertise as an automatic stimulus-driven process would predict a similar pattern of activation irrespective of the task relevance of the cars [6]. Contrary to this prediction, experts showed widespread selectivity for cars only when they were task-relevant (i.e. when they were actively attended to).
When the same car images were presented but were task-irrelevant, the car selectivity in experts diminished considerably, to the extent that there were almost no differences between the experts and novices. In other words, the neural activity characteristic of visual object expertise reflects the enhanced attentional engagement of the experts rather than the mandatory operation of perceptual, stimulus-driven recognition mechanisms. The neural correlates of expertise are not limited to a number of “hot spots” in visual cortex, but rather constitute a large-scale distributed network operating when experts are highly engaged in the recognition of objects from their domain of expertise. This raises the possibility that the preferential activation associated with object expertise is elicited only if the expert is voluntarily engaged in processing of diagnostic visual information, either because it is task-relevant or because there are no task constraints limiting this process. This would explain why experimentally reducing the engagement level of the expert reduces the selective cortical activity underlying visual expertise. Subsequent fMRI studies with car experts supported this conjecture, both in terms of the extent of activation and its modulation by attentional engagement [24, 40, 41]. Widespread distributed effects were reported in car experts both within visual cortex and outside of it. Visual experience with cars was found to predict neural activity not only in the functionally-defined FFA, but also in medial fusiform gyrus, lingual gyrus, and precuneus [24], as well as in early visual cortex, parahippocampal gyrus, hippocampus, middle and superior temporal gyrus, intraparietal sulcus, inferior frontal gyrus, and cingulate gyrus [40]. Such extensive activation as reported in [40] clearly indicates that even when experts are engaged in a very simple visual task, a host of cognitive processes are at play, many of which are arguably non-visual.
This further demonstrates how visual expertise is intrinsically linked to multiple top-down factors, which interact and modulate “pure” visual processing of the stimulus. Importantly, the attentional modulation by engagement was also addressed in [40]. Visual selective attention was manipulated by including conditions in which cars were always presented in a pair with another object. Distributed patterns of activation were manifest when the cars were attended and the other object ignored (albeit reduced relative to the isolated condition due to the more taxing nature of this task), and critically, when the car images in the pairs were ignored, the extent of activation decreased dramatically. A univariate region of interest analysis further revealed that manipulating the engagement of the experts with their objects of expertise almost entirely abolished car expertise effects in functionally defined regions in OTC.

It is important to note that experts not only direct more attention to objects of expertise, they engage in a multitude of other unique cognitive and affective processes. For example, chess experts utilize multiple cognitive functions, including object recognition, conceptual knowledge, memory, and the processing of spatial configurations [42]. Accordingly, several cortical regions were reported to be active in chess experts when they observe chessboards [25, 42]. Expert-related activity during visual recognition was found to be widespread, extending beyond visual cortex to include activations in collateral sulcus, posterior middle temporal gyrus, occipitotemporal junction, supplementary motor area, primary motor cortex, and left anterior insula. These regions have been suggested to support pattern recognition, perception of complex relations, and action-related functional knowledge of chess objects [42]. Critically, task context and prior knowledge have been shown to play an essential role in driving cortical activations in chess experts [23, 42]. Expert-specific patterns of activation manifested only when the task was specific to the domain of expertise (e.g., searching for particular chess pieces), and not when comparable control tasks with identical visual input were used (i.e., a task that did not require the recognition of particular chess pieces). In essence, when the chess experts were not engaged there was little activity that distinguished them from novices, directly echoing the findings of Harel et al. (2010). Further research is required to determine the exact nature of the interaction between the different cognitive processes involved in chess expertise and actual visual processing (i.e. how visual information is utilized and accessed by higher-level cognitive processes ubiquitous to chess). 
One chess expertise study pointing in this direction has highlighted the importance of characterizing the functions of different prefrontal and parietal regions as they relate to expertise [43]. Specifically, experts’ processing of random chess displays, as compared to normal displays, was linked to activation in prefrontal and lateral parietal brain regions. This was interpreted as support for the idea that expert chess players perform a working memory task when they respond to random boards, engaging in an active search for novel chunks that involves the dorsal fronto-parietal network.

A complementary approach to studying the neural manifestations of expertise is to measure how brain structure differs with expertise, as opposed to measuring differences in brain activation. Expertise-related structural changes in the brain allow assessment of the consequences of long-term expertise on gray matter structure across the entire brain, independently of the context set by a particular experimental task. Using the structural data from the set of experiments by Harel and colleagues reported above, a recent voxel-based morphometry (VBM) [44] study investigated whether inter-individual differences in behavioral performance with cars would predict differences in gray matter density [35]. According to the interactive view of expertise, one should predict structural changes outside of visual cortex, reflecting the involvement of higher-level cognitive processes rather than visual processing per se. Consistent with this prediction, although the skill demonstrated by the experts was in recognizing and visually matching the models of cars, changes in neural structure associated with their expertise were found outside of visual cortex. Long-term experience in car recognition correlated positively with gray matter density in local prefrontal cortex structures, and not in OTC. In sum, long-term expertise for real-world objects does not necessarily entail structural changes in regions associated with visual processing per se, but rather in more anterior regions in prefrontal cortex associated with the retrieval of goal-relevant knowledge from semantic memory [45]. Experts are more knowledgeable about both the shape and the function of their objects of expertise, and this domain-specific conceptual knowledge interacts with and guides the extraction of visual information [33, 46].

4 Implications of the Interactive View of Expertise

The current paper focused primarily on expertise in visual object recognition, reassessing its common view as a predominantly automatic stimulus-driven perceptual skill that is supported by category-selective areas in high-level visual cortex. However, the interactive view of visual expertise can be expanded to account for the neural manifestations of other types of expertise that involve the extraction of visual information. The interactive framework posits that visual expertise emerges from multiple interactions within and between the visual system and other cognitive systems, such as top-down attention and conceptual memory. These interactions are manifest in widespread distributed patterns of activity across the entire cortex, and are highly susceptible to high-level factors. Importantly, which brain networks will be engaged by different types of expertise is determined by the informational demands imposed by the particular domain of expertise, that is, based on the totality of the cognitive processes invoked by the particular domain [31].

While the current paper concentrated on task relevance and prior knowledge, other mental operations are certain to interact with and constrain visual processing. For example, professional basketball players have been shown to excel at anticipating the consequences of the actions of other players (i.e. the success of free shots at a basket). This visual skill has been associated with activations in a body-selective region in OTC as well as in frontal and parietal areas. Whereas the former reflects expert reading of the observed action kinematics, the latter have traditionally been implicated in action observation [47]. Extensive and varied activations have been observed in many additional domains of expertise, including architecture, musical notation reading, archery, taxi driving, and reading (for a review, see [31]). This demonstrates that the neural substrates of visual expertise extend well beyond visual cortex, and are manifest in regions supporting attention, memory, spatial cognition, language, and action observation. Importantly, the involvement of these systems is predictable from their general functions, suggesting that expertise evolves largely within the same systems that initially process the stimuli. Overall, it is clear that more complex forms of visual expertise recruit broad and diverse arrays of cortical and subcortical regions. This, in fact, may be the key feature of the neural architecture of expertise: visual expertise, in its broadest sense, engages multiple cognitive processes in addition to perception, and the interplay between these different cognitive systems unites what seem to be very different domains of expertise. Studying the different networks that form the neural correlates of expertise can inform us of the diverse cognitive processes involved in particular domains of expertise.
Uncovering the involvement of specific cortical networks in a particular domain of expertise has the potential to reveal the involvement of cognitive processes that were hitherto not thought to be involved in that domain. This has the potential to transform not only our theoretical understanding of visual expertise, but also the way specialized training regimens are developed for expertise. Put simply, processes that were not traditionally considered part of visual expertise would be further addressed and incorporated into newer, neurally feasible models of expertise. These neurally feasible non-visual processes of expertise can then be operationalized and used to form specific training programs. The interactive framework of visual expertise implies that training people to become experts in visual recognition has to involve additional non-visual forms of information. Moreover, special consideration needs to be given to the question of how to train observers to become more engaged in their domain, and critically, how to exert control over this enhanced engagement, so that it is applied in the right situation at the right time. In that respect, guaranteeing enhanced performance of visual experts would ultimately require integrative multi-faceted training programs, which address basic perceptual processing as well as high-level cognitive, and even motivational or affective, factors.

5 Summary

The interactive view of visual expertise suggests that superior performance in recognizing objects is an interactive process, which emerges from multiple interactions within and between the visual system and other cognitive systems, such as top-down attention and conceptual memory. These interactions are manifest in widespread distributed patterns of activity across the entire cortex, and are highly susceptible to high-level factors, such as task relevance and prior knowledge. Notably, which cortical areas are involved, and to what extent, is determined by the informational demands imposed by the particular domain of expertise. Moreover, the different brain areas implicated in a certain type of expertise do not operate independently, as activity in one area is mutually constrained by activity in the others, reflecting the interactive nature of visual processing in general, and of expertise in particular. These insights into the nature of processing in visual expertise provide a basis for future cognitive-perceptual training programs, which will capitalize on the multi-process nature of visual expertise.