Elsevier

Neural Networks

Volume 20, Issue 9, November 2007, Pages 1040-1053

2007 Special Issue
Consciousness CLEARS the mind

https://doi.org/10.1016/j.neunet.2007.09.014

Abstract

A full understanding of consciousness requires that we identify the brain processes from which conscious experiences emerge. What are these processes, and what is their utility in supporting successful adaptive behaviors? Adaptive Resonance Theory (ART) predicted a functional link between processes of Consciousness, Learning, Expectation, Attention, Resonance and Synchrony (CLEARS), including the prediction that “all conscious states are resonant states”. This connection clarifies how brain dynamics enable a behaving individual to autonomously adapt in real time to a rapidly changing world. The present article reviews theoretical considerations that predicted these functional links, how they work, and some of the rapidly growing body of behavioral and brain data that have provided support for these predictions. The article also summarizes ART models that predict functional roles for identified cells in laminar thalamocortical circuits, including the six-layered neocortical circuits and their interactions with specific primary and higher-order specific thalamic nuclei and nonspecific nuclei. These predictions include explanations of how slow perceptual learning can occur without conscious awareness, and why oscillation frequencies in the lower layers of neocortex are sometimes in the slower beta range, rather than the higher-frequency gamma range that occurs more frequently in superficial cortical layers. ART traces these properties to the existence of intracortical feedback loops, and to reset mechanisms whereby thalamocortical mismatches use circuits such as the one from specific thalamic nuclei to nonspecific thalamic nuclei and then to layer 4 of neocortical areas via layers 1-to-5-to-6-to-4.

Introduction

Adaptive Resonance Theory (ART) proposes that there is an intimate link between an animal’s conscious awareness and its ability to learn quickly about a changing world throughout life. In particular, ART points to a critical role for “resonant” states in driving fast learning; hence the name adaptive resonance. These resonant states are bound together, using internal top-down feedback, into coherent representations of the world. In particular, ART proposes how learned bottom-up categories and learned top-down expectations interact to create these coherent representations. Learned top-down expectations can be activated in a data-driven manner by bottom-up processes from the external world, or by intentional top-down processes when they prime the brain to anticipate events that may or may not occur. In this way, ART clarifies one sense, but not the only one, in which the brain carries out predictive computation.

When such a learned top-down expectation is activated, matching occurs of the top-down expectation against bottom-up data. If the bottom-up and top-down patterns are not too different, such a matching process can lead to the focusing of attention upon the expected clusters of information, which are called critical feature patterns, at the same time that mismatched signals are suppressed. A resonant state emerges through sustained feedback between the attended bottom-up signal pattern and the active top-down expectation as they reach a consensus between what is expected and what is there in the outside world.

ART predicts that all conscious states in the brain are resonant states, and that these resonant states can trigger rapid learning of sensory and cognitive representations, without causing catastrophic forgetting. This prediction clarifies why it is easier to quickly learn about information to which one pays attention. ART hereby proposes that one reason why advanced animals are intentional and attentional beings is to enable rapid learning about a changing world throughout life.

Psychophysical and neurobiological data in support of ART have been reported in experiments on vision, visual object recognition, auditory streaming, variable-rate speech perception, somatosensory perception and cognitive-emotional interactions, among others. Some of these data are summarized below. Others are reviewed in Carpenter and Grossberg (1991), Grossberg, 1999b, Grossberg, 2003a, Grossberg, 2003b, Grossberg, 2003c, and Raizada and Grossberg (2003). In particular, ART mechanisms seem to be operative at all levels of the visual system, and it has been proposed how these mechanisms are realized by laminar circuits of visual cortex as they interact with specific and nonspecific thalamic nuclei (Grossberg, 2003b, Grossberg and Versace, submitted for publication, Raizada and Grossberg, 2003, Versace and Grossberg, 2005, Versace and Grossberg, 2006). These laminar models of neocortex have been called LAMINART models because the laminar anatomy of neocortex embodies the types of attentional circuits that were predicted by ART (Grossberg, 1999a). Most recently, it has been proposed how a variation of these laminar neocortical circuits in the prefrontal cortex can carry out short-term storage of event sequences in working memory, learning of categories that selectively respond to these stored sequences, and variable-speed performance of the stored sequences under volitional control (Grossberg and Pearson, submitted for publication, Pearson and Grossberg, 2005, Pearson and Grossberg, 2006). These examples from vision and cognition show how both spatial and temporal processes can be carried out by variations of the same neocortical design, and point the way towards a general theory of laminar neocortex that can explain aspects of all higher-order intelligent behavior.

Although ART-style learning and matching processes seem to be found in many sensory and cognitive processes, another type of learning and matching is found in spatial and motor processes. Spatial and motor processing in the brain’s Where processing stream (Goodale & Milner, 1992) obey learning and matching laws that are often complementary (Grossberg, 2000b) to those used for sensory and cognitive processing in the What processing stream of the brain (Mishkin et al., 1983, Ungerleider and Mishkin, 1982). Whereas sensory and cognitive representations use attentive matching to maintain their stability as we learn more about the world, spatial and motor representations are able to forget learned maps and gains that are no longer appropriate as our bodies develop and grow from infanthood to adulthood.

These memory differences can be traced to complementary differences in the corresponding matching and learning processes. ART-like sensory and cognitive learning occurs in an approximate match state, and matching is excitatory, which enables it to realize a type of excitatory priming. Spatial and motor learning often embodies Vector Associative Map (VAM) circuits (Gaudiano and Grossberg, 1991, Guenther et al., 1994) that occur in a mismatch state, and matching is realized by an inhibitory process. These complementary differences clarify why procedural memories are unconscious; namely, the inhibitory matching process that supports spatial and motor processes cannot lead to resonance.

The LAMINART models (e.g. Fig. 1) are not merely anatomically more precise versions of previous ART ideas. They represent a breakthrough in computing that identifies new principles and processes that embody novel computational properties with revolutionary implications. LAMINART models embody a new type of hybrid between feedforward and feedback computing, and also between digital and analog computing (Grossberg, 2003b) for processing distributed data. These properties go beyond the types of Bayesian models that are so popular today. They underlie the fast but stable self-organization that is characteristic of cortical development and lifelong learning.

The synthesis of feedforward and feedback processing can be understood from the following example: When an unambiguous scene is processed, the LAMINART model can quickly group the scene in a fast feedforward sweep of activation that passes directly from layer 4 to 2/3 and then on to layers 4 to 2/3 in subsequent cortical areas (Fig. 2(c) and (e)). This property clarifies how recognition can be so fast in response to unambiguous scenes; e.g. Thorpe, Fize, and Marlot (1996). On the other hand, if there are multiple possible groupings in a scene, say in response to a complex textured scene, then competition among these possibilities due to inhibitory interactions in layers 4 and 2/3 (black cells and synapses in Fig. 2) can cause all cell activities to become smaller. This happens because the competitive circuits in the model are self-normalizing; that is, they tend to conserve the total activity of the circuit. This self-normalizing property is related to the ability of the shunting on-center off-surround networks that realize the competitive circuits to process input contrasts over a large dynamic range without saturation (Douglas et al., 1995, Grossberg, 1973, Grossberg, 1980, Heeger, 1992).
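The self-normalizing property can be illustrated with a small numerical sketch. At steady state, a shunting on-center off-surround network (Grossberg, 1973) yields activities x_i = B·I_i/(A + Σ_k I_k), so the total activity is bounded above by B no matter how many inputs compete. The decay constant A and saturation level B below are illustrative choices, not values from the model simulations.

```python
import numpy as np

def shunting_equilibrium(inputs, A=1.0, B=1.0):
    """Steady state of a shunting on-center off-surround network:
    dx_i/dt = -A*x_i + (B - x_i)*I_i - x_i*sum_{k != i} I_k.
    Setting dx_i/dt = 0 gives x_i = B*I_i / (A + sum_k I_k)."""
    I = np.asarray(inputs, dtype=float)
    return B * I / (A + I.sum())

# One strong grouping against a weak competitor: its cell stays vigorously active.
few = shunting_equilibrium([10.0, 1.0])

# Many equally strong candidate groupings: every activity shrinks,
# but the total activity remains bounded above by B (self-normalization).
many = shunting_equilibrium([10.0] * 6)

print(few, few.sum())
print(many, many.sum())
```

Because the equilibrium depends on input ratios rather than absolute intensities, multiplying all inputs by a common factor leaves the relative pattern intact, which is the sense in which the circuit processes contrasts over a large dynamic range without saturation.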

In other words, these self-normalizing circuits carry out a type of real-time probability theory in which the amplitude of cell activity covaries with the certainty of the network’s selection, or decision, about a grouping. Amplitude, in turn, is translated into processing speed and coherence of cell activities. Low activation slows down the feedforward processing in the circuit because it takes longer for cell activities to exceed output threshold and to activate subsequent cells above threshold. In the model, network uncertainty is resolved through feedback: Weakly active layer 2/3 grouping cells feed back signals to layers 6-then-4-then-2/3 to close a cortical feedback loop that contrast enhances and amplifies the winning grouping to a degree and at a rate that reflect the amount of statistical evidence for that grouping. As the winner is selected, and weaker groupings are suppressed, its cells become more active and synchronous, hence can again rapidly send the cortical decision to subsequent processing stages.

In summary, the LAMINART circuit behaves like a real-time probabilistic decision circuit that operates as quickly as possible, given the evidence. It operates in a fast feedforward mode when there is little uncertainty, and automatically switches to a slower feedback mode when there is uncertainty. Feedback selects a winning decision which enables the circuit to speed up again, since activation amplitude, synchronization and processing speed all increase with certainty.
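How feedback can contrast enhance and amplify a winning grouping may be sketched with a recurrent shunting competitive network. With a faster-than-linear signal function such as f(x) = x², such a network is known to behave like a winner-take-all choice circuit (Grossberg, 1973). The parameter values, initial activities, and forward-Euler integration below are illustrative assumptions, not the model's actual simulation settings.

```python
import numpy as np

def contrast_enhance(x0, A=0.05, B=1.0, dt=0.01, steps=20000):
    """Recurrent shunting on-center off-surround network with a
    faster-than-linear signal function f(x) = x**2:
    dx_i/dt = -A*x_i + (B - x_i)*f(x_i) - x_i * sum_{k != i} f(x_k).
    The initially largest activity is contrast enhanced into a winner,
    while the weaker candidate groupings are suppressed toward zero."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        f = x ** 2
        x += dt * (-A * x + (B - x) * f - x * (f.sum() - f))
    return x

# Three candidate groupings with slightly different initial evidence.
x = contrast_enhance([0.5, 0.4, 0.3])
print(np.round(x, 3))  # the winner is stored at high amplitude; losers collapse
```

The stored winner exhibits hysteresis: once the feedback loop has chosen it, small perturbations of the suppressed activities cannot dislodge it, which is the digital-like stability discussed above.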

The LAMINART model also embodies a novel kind of hybrid computing that simultaneously realizes the stability of digital computing and the sensitivity of analog computing. This is true because the feedback loop between layers 2/3-6-4-2/3 that selects or confirms a winning grouping (Fig. 2(c) and (e)) has the property of analog coherence (Grossberg, 1999a, Grossberg et al., 1997, Grossberg and Raizada, 2000); namely, this feedback loop can synchronously choose and store a winning grouping without losing analog sensitivity to amplitude differences in the input pattern. The coherence that is derived from synchronous storage in the feedback loop provides the stability of digital computing–the feedback loop exhibits hysteresis that can preserve the stored pattern against external perturbations–while preserving the sensitivity of analog computation.

Another noteworthy property of LAMINART circuits bears on the claim that the ability to rapidly learn throughout life without a loss of stability is related to consciousness: “all conscious states are resonant states”. However, the converse statement, “all resonant states are conscious states”, is not predicted to be true. An example of such an exception will now be described.

LAMINART circuits can stabilize development and learning using the intracortical feedback loop between layers 2/3-6-4-2/3. This feedback loop supports an intracortical “resonance”. This contrast-enhancing feedback loop selects winning groupings in the adult. It is also predicted to help stabilize development in the infant and learning throughout life, since cells that fire together wire together (Grossberg, 1999a). This intracortical circuit can work even before intercortical attentional feedback can develop (Fig. 2(e)). The LAMINART model clarified that both of these circuits can stabilize cortical development and learning, not only the top-down intercortical circuit that ART originally predicted. The intracortical feedback loops between different layers of the neocortex prevent an infinite regress from occurring, by stabilizing cortical development and learning before top-down intercortical feedback can develop and play its own role in stabilization.

Early versions of ART predicted that top-down attention can modulate and stabilize the learning process through a competitive matching process (Grossberg, 1976, Grossberg, 1980). Later modelling studies (e.g. Carpenter and Grossberg (1987)) refined this prediction to assert that this matching process is realized by a top-down, modulatory on-center, off-surround network. A great deal of perceptual and brain data have accumulated in support of this hypothesis; see Grossberg (2003b) and Raizada and Grossberg (2003) for reviews of these data, including the popular “biased competition” term for this process (Desimone, 1998).

The LAMINART model advanced this prediction by identifying intercortical and interlaminar circuits that can realize top-down, modulatory on-center, off-surround feedback (Fig. 2(b)). This additional step also clarified how pre-attentive grouping and top-down attention share the same modulatory on-center, off-surround decision circuit from layer 6-to-4 with each other, and also with feedforward pathways that automatically activate cells in response to bottom-up inputs (Fig. 2(a)–(c)). These intracortical feedback loops also solve another problem: ART predicted that, in order to prevent unstable development and learning, only bottom-up inputs can supraliminally activate brain sensory and cognitive cells that are capable of learning, since top-down attention is typically modulatory (except when volition enables top-down attention to generate visual imagery or thoughts; see Grossberg (2000a)). How, then, can illusory contours form without destabilizing brain circuits? Because a “pre-attentive grouping is its own attentional prime”: the grouping can use the layer 6-to-4 competitive decision circuit to select the correct grouping cells for learning, even without top-down attention.

This refinement of the ART prediction implies that, although top-down attention is needed for fast and stable learning of conscious experiences to occur, learning can also occur if a pre-attentive grouping competitively selects the correct cells with which to “resonate,” and thereby synchronize, for a sufficiently long time using its intracortical 2/3-6-4-2/3 feedback circuit. Such learning may be slow and inaccessible to consciousness. Watanabe, Náñez, and Sasaki (2001) have recently reported consistent data about slow perceptual learning without conscious awareness. More experiments need to be done to test if the predicted intracortical but interlaminar cortical mechanisms contribute to this sort of learning.

The sharing by grouping and attention of the same decision circuit also enables the model to explain and simulate more data, including data about how attention can selectively activate an object by propagating along the object’s boundary (Roelfsema, Lamme, & Spekreijse, 1998); see Grossberg and Raizada (2000). Additional examples of the role of boundary attention have been described in simulations of the Necker cube (Grossberg & Swaminathan, 2004) and of bistable transparency (Grossberg & Yazdanbakhsh, 2005).

Given the importance of attention in generating conscious experiences, it should be noted that at least three mechanistically distinct types of attention have been distinguished by cortical modelling studies of visual perception and recognition: boundary attention, whereby spatial attention can propagate along an object boundary to select the entire object for inspection; surface attention, whereby spatial attention can selectively fill in the surface shape of an object to form an “attentional shroud” (Tyler & Kontsevich, 1995); and prototype attention, whereby critical feature patterns of a learned object category can be selectively enhanced. Boundary attention is summarized above. Surface attention helps to intelligently search a scene with eye movements and to learn view-invariant object categories (Fazl, Grossberg, & Mingolla, 2007). Prototype attention is the type of attention that is realized by ART top-down category learning circuits (Carpenter and Grossberg, 1987, Carpenter and Grossberg, 1991, Grossberg, 2003a). All three types of attention utilize one or another type of resonant feedback loop.

Distinguishing these three types of attention is difficult if only because they interact within and across the What and Where cortical processing streams. For example, boundary attention seems to be activated, at least in the experiments of Roelfsema et al. (1998), when spatial attentional maps in the Where cortical stream, notably parietal cortex, project to perceptual boundary representations in the What cortical stream, notably the pale stripes of cortical area V2. Surface attention can be activated when spatial attentional maps in the Where cortical stream, again from parietal cortex, project to perceptual surface representations, notably in the thin stripes of cortical area V2 and in V4, and conversely to form a surface-shroud resonance. Finally, prototype attention seems to act entirely within the What cortical stream from learned recognition categories in prefrontal cortex and inferotemporal cortex to perceptual representations in V2 and V4.

Characterizing these three mechanistically distinct types of attention is further complicated by the fact that feedback interactions occur between the boundary and surface representations in cortical areas V1 and V2, and are predicted to help separate figures from their background in depth, and to control saccadic eye movements (Fang and Grossberg, in press, Fazl et al., 2007, Grossberg, 1994, Grossberg, 1997). Modulation of either boundary or surface representations by spatial attention can therefore be expected to have effects on both types of representations due to these inter-stream feedback interactions. In addition, it is well known that moving stimuli can activate both the What and Where streams and can automatically attract spatial attention in parietal cortex via Where stream pathways from V1 and MT to MST and parietal cortex.

The dynamics of LAMINART circuits, whether in their pre-attentively or attentively activated modes, depend upon the existence of balanced excitatory and inhibitory signals in different cortical layers. In particular, a balance between excitation and inhibition is needed in the perceptual grouping circuit by bipole cells in layer 2/3 (Fig. 2(c) and (e)). This balance helps to ensure that perceptual groupings can form inwardly between pairs or greater numbers of inducers, but not outwardly from a single inducer. Likewise, a balance between excitation and inhibition is required in the on-center of the circuit from layer 6-to-4 that can provide excitatory modulation of cell activities in layer 4, but not fire them fully (Fig. 2(a)–(c)). As noted above, this latter circuit plays an important role in attention (Fig. 2(b)) and in the pre-attentive selection of a correct perceptual grouping in response to a complicated scene (Fig. 2(c)). Grossberg and Williamson (2001) proposed that such balanced circuits are needed for the cortex to develop and learn in a stable way, and simulated how such balanced connections could grow during cortical development. Indeed, if inhibition develops to be too weak, then excitation can propagate uncontrollably, whereas if it is too strong, then cells cannot get sufficiently activated.
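The inward-not-outward constraint on grouping can be illustrated with a toy bipole cell in which each horizontal flank's excitation is nearly cancelled by inhibition. All the gains and the firing threshold below are hypothetical numbers chosen only to exhibit the balance, not values from the LAMINART simulations.

```python
def bipole_fires(bottom_up, left_flank, right_flank):
    """Toy layer 2/3 bipole cell. Direct bottom-up input can fire it, and
    two co-active horizontal flanks can fire it (inward grouping between
    a pair of inducers), but one flank alone cannot (no outward growth
    from a single inducer), because each flank's excitation is almost
    balanced by inhibition."""
    EXC, INH, THRESHOLD = 1.0, 0.2, 1.5   # illustrative constants
    drive = 2.0 * bottom_up               # direct bottom-up input dominates
    drive += (EXC - INH) * (left_flank + right_flank)
    return drive >= THRESHOLD

# A single inducer on one side cannot make the cell complete a contour:
assert not bipole_fires(0.0, 1.0, 0.0)
# Two inducers flanking the cell group inward and fire it:
assert bipole_fires(0.0, 1.0, 1.0)
```

If the inhibitory gain is set to zero, a single flank's drive crosses threshold and boundaries would grow outward from lone inducers; if it is raised too high, even paired inducers cannot fire the cell, mirroring the development constraints described above.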

On the other hand, balanced excitatory and inhibitory connections have also been used to explain the observed variability in the number and temporal distribution of spikes emitted by cortical neurons (Shadlen and Newsome, 1998, van Vreeswijk and Sompolinsky, 1998). These spiking patterns are quite inefficient in firing cortical cells. Given the LAMINART model proposal that such variability may reflect mechanisms that are needed to ensure stable development and learning by cortical circuits–that is, “stability implies variability”–the cortex is faced with the difficult problem of how to overcome the inefficiency of variable spiking patterns in driving responses from cortical neurons. The LAMINART model shows how these balanced excitatory and inhibitory connections work together to overcome the inefficiency of intermittent spiking by resynchronizing desynchronized signals that belong to the same object, and thereby ensuring that the cortex processes them efficiently. In other words, the very process that enables cortical cells to respond selectively to input patterns–namely, balanced excitation and inhibition–also ensures that cortical cells can fire vigorously and synchronously in response to those patterns that are selected by cortical bottom-up filtering, horizontal grouping, and top-down attention processes.

The remainder of this article summarizes properties of cortical circuits that enable them to realize the predicted CLEARS relationships, and illustrative data that support these predicted circuits.

The problem of learning makes the unity of conscious experience particularly hard to understand, if only because we are able to rapidly learn such enormous amounts of new information, on our own, throughout life. How do we integrate all of this information into unified experiences that cohere into a sense of self? One has only to see an exciting movie just once to marvel at this capacity, since we can then tell our friends many details about it later on, even though the individual scenes flashed by very quickly. More generally, we can quickly learn about new environments, even if no one tells us how the rules of each environment differ. To a remarkable degree, we can rapidly learn new facts without being forced to just as rapidly forget what we already know. As a result, we can confidently go out into the world without fearing that, in learning to recognize a new friend’s face, we will suddenly forget our parents’ faces. This is sometimes called the problem of catastrophic forgetting.

Many contemporary learning algorithms do experience catastrophic forgetting, particularly when they try to learn quickly in response to a changing world. Speaking technically, the brain solves a challenging problem that many current approaches to technology have not yet solved: It is a self-organizing system that is capable of rapid, yet stable, autonomous learning of huge amounts of data from a changing environment that can be filled with unexpected events. Discovering the brain’s solution to this key problem is as important for understanding ourselves as it is for developing new pattern recognition and prediction applications in technology.

I have called the problem whereby the brain learns quickly and stably without catastrophically forgetting its past knowledge the stability-plasticity dilemma. The stability-plasticity dilemma must be solved by every brain system that needs to rapidly and adaptively respond to the flood of signals that subserves even the most ordinary experiences. If the brain’s design is parsimonious, then we should expect to find similar design principles operating in all the brain systems that can stably learn an accumulating knowledge base in response to changing conditions throughout life. The discovery of such principles should clarify how the brain unifies diverse sources of information into coherent moments of conscious experience. ART has attempted to articulate some of these principles, and the neural mechanisms that realize them. The next sections summarize aspects of how this is proposed to occur.

Humans are intentional beings who learn expectations about the world and make predictions about what is about to happen. Humans are also attentional beings who focus processing resources upon a restricted amount of incoming information at any time. Why are we both intentional and attentional beings, and are these two types of processes related? The stability-plasticity dilemma and its solution using resonant states provides a unifying framework for understanding these issues.

To clarify the role of sensory or cognitive expectations, and of how a resonant state is activated, suppose you were asked to “find the yellow ball as quickly as possible, and you will win a $10,000 prize”. Activating an expectation of a “yellow ball” enables its more rapid detection, and with a more energetic neural response. Sensory and cognitive top-down expectations hereby lead to excitatory matching with consistent bottom-up data. Mismatch between top-down expectations and bottom-up data can suppress the mismatched part of the bottom-up data, to focus attention upon the matched, or expected, part of the bottom-up data.

Excitatory matching and attentional focusing on bottom-up data using top-down expectations generates resonant brain states: When there is a good enough match between bottom-up and top-down signal patterns between two or more levels of processing, their positive feedback signals amplify and prolong their mutual activation, leading to a resonant state. Amplification and prolongation of activity triggers learning in the more slowly varying adaptive weights that control the signal flow along pathways from cell to cell. Resonance hereby provides a global context-sensitive indicator that the system is processing data worthy of learning, hence the name Adaptive Resonance Theory, or ART.

In summary, ART predicts a link between the mechanisms which enable us to learn quickly and stably about a changing world, and the mechanisms that enable us to learn expectations about such a world, test hypotheses about it, and focus attention upon information that we find interesting. ART clarifies this link by asserting that, in order to solve the stability-plasticity dilemma, only resonant states can drive rapid new learning.

It is just a step from here to propose that those experiences which can attract our attention and guide our future lives by being learned are also among the ones that are conscious. Support for this additional assertion derives from the many modelling studies whose simulations of behavioral and brain data using resonant states map onto properties of conscious experiences in those experiments.

The type of learning within the sensory and cognitive domain that ART mechanizes is match learning: Match learning occurs only if a good enough match occurs between bottom-up information and a learned top-down expectation that is read out by an active recognition category, or code. When such an approximate match occurs, previously learned knowledge can be refined. But match learning raises a concern: What happens if a match is not good enough? How does such a model avoid perseverating on already learned representations?

If novel information cannot form a good enough match with the expectations that are read-out by previously learned recognition categories, then a memory search, or hypothesis testing, is triggered that leads to selection and learning of a new recognition category, rather than catastrophic forgetting of an old one. Fig. 2 illustrates how this happens in an ART model; it will be discussed in greater detail below. In contrast, as noted above, learning within spatial and motor processes is proposed to be mismatch learning that continuously updates sensory-motor maps or the gains of sensory-motor commands. As a result, we can stably learn what is happening in a changing world, thereby solving the stability-plasticity dilemma, while adaptively updating our representations of where objects are and how to act upon them using bodies whose parameters change continuously through time.

It has been mathematically proved that match learning within an ART model leads to stable memories in response to arbitrary lists of events to be learned (Carpenter & Grossberg, 1991). However, match learning also has a serious potential weakness: If you can only learn when there is a good enough match between bottom-up data and learned top-down expectations, then how do you ever learn anything that you do not already know? ART proposes that the brain solves this problem by using an interaction between complementary processes of resonance and reset, which are predicted to control properties of attention and memory search, respectively. These complementary processes help our brains to balance the complementary demands of processing the familiar and the unfamiliar, the expected and the unexpected.
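The complementary resonance and reset operations can be sketched in a few lines of ART 1-style code for binary inputs. The outline follows the standard ART 1 recipe (Carpenter & Grossberg, 1987): a bottom-up choice rule orders the categories, a vigilance test decides whether the match is good enough, resonance refines the winning template, and mismatch triggers reset and search, recruiting a new category if none matches. The particular parameter values are illustrative.

```python
import numpy as np

class ART1:
    """Minimal ART 1-style category learner (fast learning, binary inputs)."""
    def __init__(self, vigilance=0.8, beta=0.5):
        self.rho = vigilance   # match criterion for the vigilance test
        self.beta = beta       # tie-breaking choice parameter
        self.w = []            # learned top-down templates, one per category

    def present(self, I):
        I = np.asarray(I, dtype=bool)
        # Memory search visits categories in order of bottom-up choice signal.
        order = sorted(range(len(self.w)),
                       key=lambda j: -((I & self.w[j]).sum()
                                       / (self.beta + self.w[j].sum())))
        for j in order:
            match = (I & self.w[j]).sum() / I.sum()
            if match >= self.rho:          # resonance: refine the template
                self.w[j] = I & self.w[j]
                return j
            # else: mismatch -> reset this category, continue the search
        self.w.append(I.copy())            # no match anywhere: new category
        return len(self.w) - 1

net = ART1(vigilance=0.8)
A = [1, 1, 1, 1, 0, 0, 0, 0]
B = [0, 0, 0, 0, 1, 1, 1, 1]
net.present(A)   # recruits category 0
net.present(B)   # mismatches category 0 -> reset -> recruits category 1
net.present(A)   # resonates with category 0; its template is untouched by B
```

With vigilance 0.8, the two dissimilar patterns recruit separate categories, and re-presenting A neither forgets nor degrades its template: familiar inputs refine existing knowledge, while unexpected inputs drive hypothesis testing instead of recoding old memories.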

Organization of the brain into complementary processes is predicted to be a general principle of brain design that is not just found in ART (Grossberg, 2000b). A complementary process can individually compute some properties well, but cannot, by itself, process other complementary properties. In thinking intuitively about complementary properties, one can imagine puzzle pieces fitting together. Both pieces are needed to finish the puzzle. Complementary processes in the brain are much more dynamic than any such analogy, however: Pairs of complementary processes interact in such a way that their emergent properties overcome their complementary deficiencies to compute complete information about some aspect of the control of intelligent behavior.

The resonance process in the complementary pair of resonance and reset is predicted to take place in the What cortical stream, notably in the inferotemporal and prefrontal cortex. Here top-down expectations are matched against bottom-up inputs (Desimone, 1998, Miller et al., 1991). When a top-down expectation achieves a good enough match with bottom-up data, this match process focuses attention upon those feature clusters in the bottom-up input that are expected. If the expectation is close enough to the input pattern, then a state of resonance develops as the attentional focus takes hold.

Fig. 2 illustrates these ART ideas in a simple two-level example. Here, a bottom-up input pattern, or vector, I activates a pattern X of activity across the feature detectors of the first level F1. For example, a visual scene may be represented by the features comprising its boundary and surface representations. This feature pattern represents the relative importance of different features in the input pattern I. In Fig. 2(a), the pattern peaks represent more activated feature detector cells, the troughs less activated feature detectors. This feature pattern sends signals S through an adaptive filter to the second level F2 at which a compressed representation Y (also called a recognition category, or a symbol) is activated in response to the distributed input T. Input T is computed by multiplying the signal vector S by a matrix of adaptive weights that can be altered through learning. The representation Y is compressed by competitive interactions across F2 that allow only a small subset of its most strongly activated cells to remain active in response to T. The pattern Y in the figure indicates that a small number of category cells may be activated to different degrees. These category cells, in turn, send top-down signals U to F1. The vector U is converted into the top-down expectation V by being multiplied by another matrix of adaptive weights. When V is received by F1, a matching process takes place between the input vector I and V which selects that subset X of F1 features that were “expected” by the active F2 category Y. The set of these selected features is the emerging “attentional focus”.
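The data flow just described can be written out in a few lines, using the article's vector names I, X, S, T, Y, U, and V. The layer sizes, random weights, hard winner-take-all choice, and minimum-based matching rule are illustrative assumptions; in a full ART model the weights would be learned and the competition and matching would be dynamical processes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_categories = 6, 3                # hypothetical layer sizes

W_up = rng.random((n_features, n_categories))  # bottom-up adaptive filter
W_down = W_up.T.copy()                         # top-down expectation weights

I = np.array([0.9, 0.8, 0.7, 0.1, 0.1, 0.0])  # bottom-up input pattern
X = I                                          # F1 activity pattern
S = X                                          # signals into the adaptive filter
T = S @ W_up                                   # total input to each F2 category

# Competition across F2 compresses T into a category choice Y
# (a hard argmax stands in for the competitive dynamics).
Y = (T == T.max()).astype(float)

U = Y                                          # top-down signals from F2
V = U @ W_down                                 # learned top-down expectation at F1

# Matching selects the attended feature subset: strong in I AND expected in V.
X_attended = np.minimum(I, V)
print(np.round(X_attended, 2))
```

Features that are present in I but unexpected under V are suppressed in X_attended, which is the mechanistic sense in which the matched subset becomes the emerging attentional focus.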

If the top-down expectation is close enough to the bottom-up input pattern, then the pattern X of attended features reactivates the category Y which, in turn, reactivates X. The network hereby locks into a resonant state through a positive feedback loop that dynamically links, or binds, the attended features across X with their category, or symbol, Y.

Resonance itself embodies another type of complementary processing. Indeed, there seem to be complementary processes both within and between cortical processing streams (Grossberg, 2000b). This particular complementary relation occurs between distributed feature patterns and the compressed categories, or symbols, that selectively code them:

Individual features at F1 have no meaning on their own, just as the pixels in a picture are meaningless one by one. The category, or symbol, in F2 is sensitive to the global patterning of these features, and can selectively fire in response to this pattern. But it cannot represent the “contents” of the experience, including their conscious qualia, precisely because a category is a compressed, or “symbolic”, representation. Practitioners of Artificial Intelligence have claimed that neural models can process distributed features, but not symbolic representations. This is not true of the brain, which is the most accomplished processor of both distributed features and symbols known to humans. Nor is it true of ART.

The resonance between these two types of information converts the pattern of attended features into a coherent context-sensitive state that is linked to its category through feedback. It is this coherent state, which binds together distributed features and symbolic categories, that can enter consciousness. This resonant binding process joins spatially distributed features into either a stable equilibrium or a synchronous oscillation. The original ART article (Grossberg, 1976) predicted the existence of such synchronous oscillations, which were there described in terms of their mathematical properties as “order-preserving limit cycles”. Since the first neurophysiological experiments reported such synchronous oscillations (Eckhorn et al., 1988, Gray and Singer, 1989), there have been a rapidly growing number of supportive experiments. Simulations of fast-synchronizing ART and perceptual grouping circuits were reported in Grossberg and Somers (1991) and Grossberg and Grunewald (1997), and in a laminar cortical model by Yazdanbakhsh and Grossberg (2004). The ability of neural circuits to synchronize quickly is a topic that is worthy of considerable discussion in its own right.

In ART, the resonant state, rather than bottom-up activation, is predicted to drive the learning process. The resonant state persists long enough, and at a high enough activity level, to activate the slower learning processes in the adaptive weights that guide the flow of signals between bottom-up and top-down pathways between levels F1 and F2 in Fig. 2. This viewpoint helps to explain how adaptive weights that were changed through previous learning can regulate the brain’s present information processing, without learning about the signals that they are currently processing unless they can initiate a resonant state. Through resonance as a mediating event, one can understand from a deeper mechanistic view why humans are intentional beings who are continually predicting what may next occur, and why we tend to learn about the events to which we pay attention.

More recent laminar versions of ART, notably the Synchronous Matching ART (SMART) model (Grossberg and Versace, submitted for publication, Versace and Grossberg, 2005, Versace and Grossberg, 2006), show how a match may lead to fast gamma oscillations that facilitate spike-timing dependent plasticity (STDP), whereas mismatch can lead to slower beta oscillations that greatly lower the probability that mismatched events can be learned by a STDP learning law. These new features will be summarized below after more basic concepts are reviewed.

A sufficiently bad mismatch between an active top-down expectation and a bottom-up input, say because the input represents an unfamiliar type of experience, can drive a memory search. Such a mismatch within the attentional system is proposed to activate a complementary orienting system, which is sensitive to unexpected and unfamiliar events. ART suggests that this orienting system includes the hippocampal system, which has long been known to be involved in mismatch processing, including the processing of novel events (Deadwyler et al., 1979, Deadwyler et al., 1981, Otto and Eichenbaum, 1992, Sokolov, 1968, Vinogradova, 1975). More recent work on SMART also implicated the nonspecific thalamic nuclei; see below. Output signals from the orienting system rapidly reset the recognition category that has been reading out the poorly matching top-down expectation (Fig. 2(b) and (c)). The cause of the mismatch is hereby removed, thereby freeing the system to activate a different recognition category (Fig. 2(d)). The reset event hereby triggers memory search, or hypothesis testing, which automatically leads to the selection of a recognition category that can better match the input. Various data support the existence of this predicted hypothesis testing cycle. In particular, Banquet and Grossberg (1987) summarized evidence from an experiment on humans that was designed to test this ART prediction by measuring event-related potentials (ERPs). This study showed that sequences of P120–N200–P300 ERPs have the properties of ART mismatch–arousal–reset that are predicted to occur during a hypothesis testing cycle. Many subsequent studies have provided additional evidence for predictive coding; e.g. Ahissar and Hochstein (2002), Desimone (1998), Engel, Fries, and Singer (2001), Gao and Suga (1998), Herrmann, Munk, and Engel (2004), Krupa, Ghazanfar, and Nicolelis (1999), and Salin and Bullier (1995).

If no such recognition category exists, say because the bottom-up input represents a truly novel experience, then the search process automatically activates an as-yet-uncommitted population of cells with which to learn about the novel information. So that the top-down expectation of a newly committed category can match any input pattern, its top-down adaptive weights initially have large values, which are pruned as a particular expectation is learned.

This learning process works well under both unsupervised and supervised conditions (Carpenter, Grossberg, Markuzon, Reynolds, & Rosen, 1992). Unsupervised learning means that the system can learn how to categorize novel input patterns without any external feedback. Supervised learning uses predictive errors to let the system know whether it has categorized the information correctly. Supervision can force a search for new categories that may be culturally determined, and are not based on feature similarity alone. For example, separating the featurally similar letters E and F into separate recognition categories is culturally determined. Such error-based feedback enables variants of E and F to learn their own category and top-down expectation, or prototype. The complementary, but interacting, processes of attentive-learning and orienting-search together realize a type of error correction through hypothesis testing that can build an ever-growing, self-refining internal model of a changing world.

What combinations of features or other information are bound together into conscious object or event representations? One view is that exemplars, or individual experiences, are learned, because humans can have very specific memories. For example, we can all recognize the particular faces of our friends. On the other hand, storing every remembered experience as exemplars can lead to a combinatorial explosion of memory, as well as to unmanageable problems of memory retrieval. A possible way out is suggested by the fact that humans can learn prototypes which represent general properties of the environment (Posner & Keele, 1968). For example, we can recognize that everyone has a face. But then how do we learn specific episodic memories? ART provides an answer to this question that overcomes problems faced by earlier models.

The first thing to realize is that ART prototypes are not merely averages of the exemplars that are classified by a category, as is typically assumed in classical prototype models. Rather, they are the actively selected critical feature patterns upon which the top-down expectations of the category focus attention. In addition, the generality of the information that is coded by these critical feature patterns is controlled by a gain control process, called vigilance control, which can be influenced by environmental feedback or internal volition (Carpenter & Grossberg, 1987). Low vigilance permits the learning of general categories with abstract prototypes. High vigilance forces a memory search to occur for a new category when even small mismatches exist between an exemplar and the category that it activates. As a result, in the limit of high vigilance, the category prototype may encode an individual exemplar.

Vigilance is computed within the orienting system of an ART model (Fig. 2(b)–(d)). It is here that bottom-up excitation from all the active features in an input pattern I is compared with inhibition from all the active features in a distributed feature representation across F1. If the ratio of the total activity across the active features in F1 (that is, the “matched” features) to the total activity due to all the features in I is less than a vigilance parameter ρ (Fig. 2(b)), then a reset wave is activated (Fig. 2(c)), which can drive the search for another category with which to classify the exemplar. In other words, the vigilance parameter controls how bad a match can be before search for a new category is initiated. If the vigilance parameter is low, then many exemplars can all influence the learning of a shared prototype, by chipping away at the features which are not in common with all the exemplars. If the vigilance parameter is high, then even a small difference between a new exemplar and a known prototype (e.g. F vs. E) can drive the search for a new category with which to represent F.
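For binary features, the vigilance test and the ensuing hypothesis-testing cycle can be sketched as follows. The ordering of categories by bottom-up input, the min-based match, and the all-ones uncommitted node are conventional ART-1-style simplifications, not details specified in this article.

```python
# Toy ART search cycle: categories are tried in order of bottom-up input;
# a reset wave rejects any category whose match ratio |I ∧ V| / |I| falls
# below vigilance rho; an all-ones uncommitted node ends the search.

def art_search(I, W_bu, W_td, rho):
    order = sorted(range(len(W_bu)),
                   key=lambda j: -sum(w * i for w, i in zip(W_bu[j], I)))
    for j in order:
        matched = sum(min(i, v) for i, v in zip(I, W_td[j]))
        if matched / sum(I) >= rho:   # resonance: match exceeds vigilance
            return j
        # else: reset wave inhibits category j; the search continues
    return None

W_bu = [[0.8, 0.8, 0.0, 0.0], [0.2, 0.2, 0.2, 0.2]]
W_td = [[1, 1, 0, 0], [1, 1, 1, 1]]   # category 1 is still uncommitted
I = [1, 0, 1, 1]
```

At high vigilance (rho = 0.8) the first-tried committed category is reset (match ratio 1/3) and the uncommitted node is selected; at low vigilance (rho = 0.3) the committed category's coarse prototype is accepted, illustrating how rho trades generality against specificity.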

One way to control vigilance is by a process of match tracking. Here, if a predictive error occurs (e.g. D is predicted in response to F), the vigilance parameter increases until it is just higher than the ratio of active features in F1 to total features in I. In other words, vigilance “tracks” the degree of match between the input exemplar and the matched prototype. This is the minimal level of vigilance that can trigger a reset wave, and thus a memory search for a new category. Match tracking realizes a Minimax Learning Rule that conjointly maximizes category generality while it minimizes predictive error. In other words, match tracking uses the least memory resources that can prevent errors from being made.
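A minimal sketch of match tracking, assuming a small fixed increment eps lifts vigilance just above the current match ratio after a predictive error (the increment and function name are our own illustration):

```python
# Match tracking: after a predictive error, raise vigilance to just above
# the current match ratio |X|/|I|, so the offending category is reset and
# a finer category is searched for. eps is an illustrative small margin.

def match_track(I, X, rho, eps=1e-3):
    match_ratio = sum(X) / sum(I)
    return max(rho, match_ratio + eps)
```

Because vigilance rises only as far as the error demands, categories stay as general as the data allow, which is the Minimax property described above.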

Because vigilance can vary across learning trials, recognition categories capable of encoding widely differing degrees of generalization or abstraction can be learned by a single ART system. Low vigilance leads to broad generalization and abstract prototypes. High vigilance leads to narrow generalization and to prototypes that represent fewer input exemplars, even a single exemplar. Thus a single ART system may be used, say, to learn abstract prototypes with which to recognize abstract categories of faces and dogs, as well as “exemplar prototypes” with which to recognize individual views of faces and dogs. ART models hereby try to learn the most general category that is consistent with the data. This tendency can, for example, lead to the type of overgeneralization that is seen in young children until further learning leads to category refinement (Chapman et al., 1986, Clark, 1973, Smith et al., 1985, Smith and Kemler, 1978, Ward, 1983).

If vigilance control is important for normal learning, one might expect breakdowns in vigilance control to contribute to certain mental disorders. It has been suggested that an abnormally low vigilance may contribute to medial temporal amnesia (Carpenter & Grossberg, 1993) and that an abnormally high vigilance may contribute to autism (Grossberg & Seidman, 2006). These proposals point to the utility of classifying certain mental disorders as “vigilance diseases”.

A biologically relevant neural model must be able to explain and predict more behavioral and neural data than its competitors. One additional mark of maturity of such a model is that it “works” and can solve complicated real-world problems. Many benchmark studies of ART show that it is useful in large-scale engineering and technological applications. See http://profusion.bu.edu/techlab for some illustrative benchmark studies. In particular, vigilance control in the classification of complex databases enables the number of ART categories that are learned to scale well with the complexity of the input data.

As sequences of inputs are practiced over learning trials, the search process eventually converges upon stable categories. It has been mathematically proved (Carpenter & Grossberg, 1987) that familiar inputs directly access the category whose prototype provides the globally best match, while unfamiliar inputs engage the orienting subsystem to trigger memory searches for better categories until they become familiar. This process continues until the memory capacity, which can be chosen arbitrarily large, is fully utilized. The process whereby search is automatically disengaged is a form of memory consolidation that emerges from network interactions. Emergent consolidation does not preclude structural consolidation at individual cells, since the amplified and prolonged activities that subserve a resonance may be a trigger for learning-dependent cellular processes, such as protein synthesis and transmitter production.

It has also been shown that the adaptive weights which are learned by some ART models can, at any stage of learning, be translated into IF-THEN rules (e.g. Carpenter et al. (1992)). Thus the ART model is a self-organizing rule-discovering production system as well as a neural network. These examples show that the claims of some cognitive scientists and AI practitioners that neural network models cannot learn rule-based behaviors are as incorrect as the claims that neural models cannot learn symbols.
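In the same spirit, a category's pruned top-down weights can be read off as a rule. The feature names and threshold below are our own illustration, not the actual Carpenter et al. (1992) extraction procedure:

```python
# Reading a category's top-down weights as an IF-THEN rule: features whose
# weight survives pruning above a threshold become the rule's antecedents.
# Feature names, weights, and theta are hypothetical.

def weights_to_rule(features, w_td, label, theta=0.5):
    conds = [f for f, w in zip(features, w_td) if w >= theta]
    return "IF " + " AND ".join(conds) + " THEN " + label

rule = weights_to_rule(["horizontal_bar", "vertical_bar", "diagonal"],
                       [0.9, 0.8, 0.1], "letter_T")
```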

The Synchronous Matching Adaptive Resonance Theory (SMART) model advances ART in several ways (Grossberg and Versace, submitted for publication, Versace and Grossberg, 2005, Versace and Grossberg, 2006); see Fig. 3. SMART links attentive learning requirements to how laminar neocortical circuits interact with primary, higher-order (e.g. the pulvinar nucleus; Sherman and Guillery, 2001, Shipp, 2003), and nonspecific thalamic nuclei (van Der Werf, Witter, & Groenewegen, 2002). Corticothalamocortical pathways work in parallel with corticocortical routes (Maunsell and Van Essen, 1983, Salin and Bullier, 1995, Sherman and Guillery, 2002). Specific first-order thalamic nuclei (such as the Lateral Geniculate Nucleus, LGN) relay sensory information to the cerebral cortex, whereas specific second-order thalamic nuclei receive their main input from lower-order cortical areas, notably from layer 5, and relay this information to higher-order cortical areas (Sherman & Guillery, 2002). The model clarifies how a match between cortical and thalamic inputs at the level of specific first-order and higher-order thalamic nuclei might subserve fast stable learning of neural representations in the thalamocortical system.

In particular, suppose that, at a specific thalamic nucleus, a sufficiently good match occurs between a bottom-up input pattern and a top-down expectation from layer 6 of its corresponding cortical area. Such a match can trigger fast synchronized gamma oscillations (γ, 20–70 Hz), whose short period enables synchronized spikes to drive learning via a spike-timing-dependent plasticity (STDP; Bi and Poo, 2001, Levy and Steward, 1983, Markram et al., 1997) learning rule. In particular, STDP is maximal when pre- and post-synaptic cells fire within 10–20 ms of each other, and thus favors learning in match states whose synchronous fast oscillations fall within the temporal constraints of STDP (Traub et al., 1998, Wespatat et al., 2004). In contrast, mismatched cells undergo slower beta oscillations (β, 4–20 Hz), whose spikes do not fall within the STDP learning window.
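A back-of-envelope check of this timing argument: if cells locked to a common oscillation can fire up to half a period apart, only frequencies whose half-period fits inside the STDP window support learning. The ±20 ms window width and the half-period criterion are simplifying assumptions, not model equations from SMART:

```python
# Does synchronization at a given oscillation frequency keep spike lags
# inside a +/-20 ms STDP window? Half-period approximates the worst-case
# lag between two cells locked to the same cycle.

STDP_WINDOW_MS = 20.0

def within_stdp_window(freq_hz):
    half_period_ms = 1000.0 / freq_hz / 2.0
    return half_period_ms <= STDP_WINDOW_MS
```

On this crude criterion a 40 Hz gamma rhythm (half-period 12.5 ms) keeps spikes within the window, while a 10 Hz beta/alpha rhythm (half-period 50 ms) does not, consistent with the match/mismatch distinction in the text.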

SMART hereby brings the new features of synchronized oscillation frequency and STDP into the discussion of how learning is selectively regulated. Aggregate and single-cell recordings from multiple thalamic and cortical levels of mammals have shown high- and low-frequency rhythmic synchronous activity correlated with cognitive, perceptual and behavioral tasks, and large-scale neuronal population models have been proposed to model oscillatory dynamics (Bazhenov et al., 1998, Destexhe et al., 1999, Lumer et al., 1997, Siegel et al., 2000). However, these models have not linked brain spikes, oscillations, STDP, and the brain states that subserve cognitive information processing.

SMART proposes that such a match or mismatch at a higher cortical level can occur as follows: Activation of layer 5 cells in a lower cortical area (e.g. V1) generates driving inputs to a higher-order specific thalamic area (e.g. pulvinar); see Rockland, Andresen, Cowie, and Robinson (1999). Terminations arising from layer 5 resemble retinogeniculate RL, or driving, synapses, and are often found in more proximal segments of the dendrites. This pattern of connectivity seems to be conserved across species (Rouiller & Welker, 2000).

A top-down expectation from layer 6II of the corresponding cortical area (e.g. V2) is matched in this thalamic region against the layer 5 output pattern, similar to the way in which retinal inputs to the lateral geniculate nucleus are matched by top-down signals from layer 6II of V1. If a sufficiently good match occurs, then synchronized gamma oscillations can be triggered in the pulvinar and V2, leading to learning of the critical features that are part of the matched pattern.

If the match is not good enough, then the nonspecific thalamic nucleus gets activated by a mechanism that is similar to that summarized in Fig. 2, but which is anatomically more precisely characterized in Fig. 3. Nonspecific thalamic activation is nonspecifically broadcast as an arousal signal to many cortical areas via diffuse inputs across layer 1. We suggest that this nonspecific pathway is part of the orienting system that triggers reset in response to mismatch events. In particular, apical dendrites in layer 1 of layer 5 cells receive this arousal input. If some of these layer 5 cells are active when the arousal burst occurs, their firing rate is enhanced in response to the arousal input. This enhancement of layer 5 cell firing triggers a selective reset of cortical and thalamic cells in the following way:

Layer 5 cells project to layer 4 via layer 6 (Fig. 3). The signals from layer 6 to 4 are gated by habituative transmitters, also called depressing synapses. Activation patterns in these circuits just before the arousal burst bias the habituative network of cells feeding layer 4. The active circuits are presumably the ones that caused the predictive mismatch. When the arousal burst occurs, these previously active cells are disadvantaged relative to cells that were not active. A reset event can then occur that inhibits the previously active cells as it selects new cells with which to better code the novel input.
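The reset logic can be caricatured with two cells and depressing gates; the depletion rate, signal values, and function names are illustrative only, chosen to show why the arousal burst favors cells that were not just active:

```python
# Mismatch-triggered reset via habituative (depressing) synapses in the
# layer 6 -> 4 pathway: gates deplete with use, so a nonspecific arousal
# burst advantages previously inactive cells.

def gated_signal(activity, gate):
    return [a * g for a, g in zip(activity, gate)]

def deplete(gate, activity, rate=0.6):
    # Habituation: each gate shrinks in proportion to its recent use.
    return [g * (1 - rate * a) for g, a in zip(gate, activity)]

gate = [1.0, 1.0]
old = [1.0, 0.0]            # cell 0 drove the predictive mismatch
gate = deplete(gate, old)   # cell 0's layer 6 -> 4 gate habituates to 0.4
burst = [1.0, 1.0]          # arousal reaches both cells equally
out = gated_signal(burst, gate)
winner = max(range(2), key=lambda i: out[i])   # reset selects cell 1
```

The previously active cell transmits only 0.4 of the burst while the fresh cell transmits 1.0, so the competition tips toward new cells that can better code the novel input, as the paragraph above describes.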

This model explanation is supported by quantitative simulations of data about single cell biophysics and neurophysiology, laminar neuroanatomy, aggregate cell recordings (current-source densities, local field potentials), and large-scale oscillations at beta and gamma frequencies, which the model functionally links to requirements about how to achieve fast stable attentive learning.

As noted above, SMART predicts that thalamocortical mismatches may cause cortical reset via the deeper cortical layers 6 and 4. Model simulations also show that mismatches lead to slower beta oscillations. Putting these two properties together leads to the prediction that the deeper layers of neocortex may express beta oscillations more frequently than the superficial layers. Such a property has recently been experimentally reported (Buffalo, Fries, & Desimone, 2004). It remains to test whether the observed experimental property is related to the SMART reset prediction.

Two issues may be noted in this regard. One concerns how the prediction may be tested: One possible test would be to carry out a series of experiments on the same animal in which the animal is exposed to environments with progressively more novel events. More novel events should cause more cortical resets. Do more cortical resets per unit time cause more beta oscillations in the lower cortical layers? The second issue notes that the differences between the oscillation frequencies in the deeper and more superficial cortical layers are averages over time. This is essential to realize because there exist interlaminar intracortical feedback loops that may be expected to synchronize all the cortical layers during a match event (Yazdanbakhsh & Grossberg, 2004). Indeed, these are the intracortical feedback loops whereby “a pre-attentive grouping is its own attentional prime”, and thus enable neocortex to develop its circuits, without a loss of stability, even before intercortical attentional circuits can develop.


Discussion

This article summarizes how Adaptive Resonance Theory explains mechanistic relationships between the CLEARS properties of Consciousness, Learning, Expectation, Attention, Resonance, and Synchrony. ART proposes that these processes work together to solve the stability-plasticity dilemma, and thus to enable advanced animals, including humans, to learn quickly about a changing world throughout life without experiencing catastrophic forgetting. Conscious events are predicted to be a subset of

Acknowledgements

Stephen Grossberg was supported in part by the National Science Foundation (NSF SBE-0354378) and the Office of Naval Research (ONR N00014-01-1-0624). Thanks to Megan Johnson for her help in preparing the article.

References (89)

  • S. Grossberg et al., Visual brain and visual perception: How does the cortex do perceptual grouping? Trends in Neurosciences (1997)
  • S. Grossberg et al., Contrast-sensitive perceptual grouping and object-based attention in the laminar circuits of primary visual cortex, Vision Research (2000)
  • S. Grossberg et al., Synchronized oscillations during cooperative feature linking in a cortical model of visual perception, Neural Networks (1991)
  • S. Grossberg et al., A laminar cortical model for 3D perception of slanted and curved surfaces and of 2D images: Development, attention and bistability, Vision Research (2004)
  • S. Grossberg et al., Laminar cortical dynamics of 3D surface perception: Stratification, transparency, and neon color spreading, Vision Research (2005)
  • C.S. Herrmann et al., Cognitive functions of gamma-band activity: Memory match and utilization, Trends in Cognitive Sciences (2004)
  • W.B. Levy et al., Temporal contiguity requirements for long-term associative potentiation/depression in the hippocampus, Neuroscience (1983)
  • M. Mishkin et al., Object vision and spatial vision: Two cortical pathways, Trends in Neurosciences (1983)
  • E.M. Rouiller et al., A comparative analysis of the morphology of corticothalamic projections in mammals, Brain Research Bulletin (2000)
  • C. Smith et al., On differentiation: A case study of the development of the concept of size, weight, and density, Cognition (1985)
  • L.B. Smith et al., Levels of experienced dimensionality in children and adults, Cognitive Psychology (1978)
  • R.D. Traub et al., Gamma-frequency oscillations: A neuronal population phenomenon, regulated by synaptic and intrinsic cellular processes, and inducing synaptic plasticity, Progress in Neurobiology (1998)
  • Y.D. van Der Werf et al., The intralaminar and midline nuclei of the thalamus: Anatomical and functional evidence for participation in processes of arousal and awareness, Brain Research Reviews (2002)
  • A. Yazdanbakhsh et al., Fast synchronization of perceptual grouping in laminar visual cortical circuits, Neural Networks (2004)
  • M. Ahissar et al., View from the top: Hierarchies and reverse hierarchies in the visual system, Neuron (2002)
  • J.P. Banquet et al., Probing cognitive processes through the structure of event-related potentials during learning: An experimental and theoretical analysis, Applied Optics (1987)
  • M. Bazhenov et al., Computational models of thalamocortical augmenting responses, The Journal of Neuroscience (1998)
  • G.Q. Bi et al., Synaptic modification by correlated activity: Hebb’s postulate revisited, Annual Review of Neuroscience (2001)
  • E.A. Buffalo et al., Layer-specific attentional modulation in early visual areas, Society for Neuroscience Abstract (2004)
  • G.A. Carpenter et al., Pattern recognition by self-organizing neural networks (1991)
  • G.A. Carpenter et al., Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps, IEEE Transactions on Neural Networks (1992)
  • K.L. Chapman et al., The effect of feedback on young children’s inappropriate word usage, Journal of Child Language (1986)
  • S.A. Deadwyler et al., Entorhinal and septal inputs differentially control sensory-evoked responses in the rat dentate gyrus, Science (1981)
  • R. Desimone, Visual attention mediated by biased competition in extrastriate visual cortex, Philosophical Transactions of the Royal Society of London B: Biological Sciences (1998)
  • R.J. Douglas et al., Recurrent excitation in neocortical circuits, Science (1995)
  • R. Eckhorn et al., Coherent oscillations: A mechanism of feature linking in the visual cortex? Biological Cybernetics (1988)
  • A.K. Engel et al., Dynamic predictions: Oscillations and synchrony in top-down processing, Nature Reviews Neuroscience (2001)
  • Fang, L., & Grossberg, S. (2007). From stereogram to surface: How the brain sees the world in depth. Spatial Vision (in...
  • Fazl, A., Grossberg, S., & Mingolla, E. (2007). View-invariant object category learning, recognition, and search: How...
  • E. Gao et al., Experience-dependent corticofugal adjustment of midbrain frequency map in bat auditory system, Proceedings of the National Academy of Sciences USA (1998)
  • C.M. Gray et al., Stimulus-specific neuronal oscillations in orientation columns of cat visual cortex, Proceedings of the National Academy of Sciences (1989)
  • S. Grossberg, Contour enhancement, short-term memory, and constancies in reverberating neural networks, Studies in Applied Mathematics (1973)
  • S. Grossberg, Adaptive pattern classification and universal recoding, II: Feedback, expectation, olfaction, and illusions, Biological Cybernetics (1976)
  • S. Grossberg, How does a brain build a cognitive code? Psychological Review (1980)