A Bayesian model for canonical circuits in the neocortex for parallelized and incremental learning of symbol representations
Introduction
We present a model for a hypothetical functional unit of the neocortex and its relationship with proximal peers that share the same cognitive context. Our model is not one of the neocortex at large, which would require a network of cognitive contexts, but we hope it offers a building block toward that objective. Our approach is to first identify key computational aspects of the neocortex, and then build the model upon those assumptions. We demonstrate how the resulting model can orthogonalize a cognitive context by developing representations for its cognitive symbols. The model is evaluated on popular machine learning datasets.
Our assumptions are detailed in Section 2. The principal assumption is the existence of an elementary functional unit in the neocortex, identified as a canonical circuit. Next, we assume that each canonical circuit develops to represent a particular cognitive symbol, by learning towards data associated with its symbol, and by learning away from the data of all other symbols. Another assumption is that each canonical circuit must execute concurrently with all other canonical circuits, in complete task parallelism. Next, we assume that neocortical computation is analogous to Bayesian inference, and we approach this aspect through Marr's three levels of analysis. Lastly, we assume that canonical circuits must operate in an inherently incremental manner, learning from only a few examples, from infinite data streams, and without having to store old data or use multiple epochs. There is existing work in neocortical computational modeling which covers the previously listed assumptions to a certain extent, either individually or in subsets. However, we are not aware of any work which covers all the assumptions jointly. One of our contributions is that we identify the state of each aspect in current neuroscience, propose correlations between them, and propose a model that puts them all in a common framework.
In our model, each cognitive symbol is represented by a canonical circuit in the form of an independent Bayesian neural network. Each of these neural networks updates with its own Bayesian inference process. The inference processes of all cognitive symbols are coupled in an inhibitory way, so that each pursues uniqueness and the overall result is orthogonalization of the cognitive context which describes the data stream. The model starts blank and adds a canonical circuit for each new symbol as it appears in the stream. For example, a cognitive context could be "direction of motion," with its symbols being "up", "down", "right", etc. Fig. 1 shows a simplified visualization of how canonical circuits orthogonalize a cognitive context. Due to the task parallelism, regardless of how many canonical circuits become involved, the model runs in constant time, and the canonical circuits can be distributed to different processors or machines across a network.
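The grow-on-demand, one-circuit-per-symbol idea can be sketched as follows. This is an illustrative stand-in, not the paper's model: each `Circuit` here scores instances by a running mean rather than by a Bayesian neural network, the inhibitory "learning away" coupling is omitted for brevity, and all class and method names (`Circuit`, `CognitiveContext`, `learn_toward`, etc.) are ours.

```python
from dataclasses import dataclass, field

@dataclass
class Circuit:
    """Stand-in for one canonical circuit: tracks a running mean of its
    symbol's data and scores new instances by proximity."""
    mean: float = 0.0
    n: int = 0

    def learn_toward(self, x: float) -> None:
        self.n += 1
        self.mean += (x - self.mean) / self.n   # incremental mean update

    def score(self, x: float) -> float:
        return -abs(x - self.mean)              # higher = closer to this symbol

@dataclass
class CognitiveContext:
    """Starts blank; adds one circuit per new symbol as it appears in the
    stream. Updates and scores are independent across circuits, so each
    circuit could run on its own processor."""
    circuits: dict = field(default_factory=dict)

    def observe(self, symbol: str, x: float) -> None:
        self.circuits.setdefault(symbol, Circuit()).learn_toward(x)

    def classify(self, x: float) -> str:
        return max(self.circuits, key=lambda s: self.circuits[s].score(x))

# Stream labeled instances; three circuits are created on the fly.
ctx = CognitiveContext()
for sym, x in [("up", 1.0), ("down", -1.0), ("up", 1.2), ("right", 5.0)]:
    ctx.observe(sym, x)
print(ctx.classify(1.1))   # → up
```

Because each circuit only reads the shared instance and writes its own state, the per-instance work is constant in the number of circuits once they are distributed, matching the constant-time claim above.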
In order to meet the requirements for incremental learning, we developed a novel Bayesian inference method that runs Metropolis-Hastings (MH) on a data stream. The method makes it possible for a single data instance to be sufficient in forming a useful representation of a symbol, and for each symbol to update efficiently with each new data instance. No data instances before the last one seen per symbol need to be kept for subsequent updates. We discuss how it is possible, in an optimal model, for not even the latest instance to be required. We call our method Incremental Metropolis-Hastings (IMH). IMH recurrently re-uses the last posterior as a new prior. Priors and posteriors are represented as non-parametric probability distributions, realized through Monte Carlo samples or kernel density estimators. Therefore, the inference does not suffer from the limitations of point-based approximations such as Maximum-a-Posteriori.
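The IMH loop can be sketched in a minimal form, assuming a one-dimensional parameter, a Gaussian likelihood, and a Gaussian kernel density estimate of the previous posterior's samples as the re-used prior. All function names, the fixed bandwidth, and the proposal scale are our illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def kde_log_prior(theta, samples, bw=0.3):
    """Log density of a Gaussian KDE built from the previous posterior's
    samples -- the non-parametric prior that IMH re-uses."""
    log_k = -0.5 * ((theta - samples) / bw) ** 2
    return np.logaddexp.reduce(log_k) - np.log(len(samples) * bw * np.sqrt(2.0 * np.pi))

def log_likelihood(theta, x, sigma=1.0):
    """Gaussian log-likelihood of the single newest data instance."""
    return -0.5 * ((x - theta) / sigma) ** 2

def imh_update(prior_samples, x, n_steps=500, prop_scale=0.5):
    """One incremental update: run Metropolis-Hastings against
    KDE(previous posterior) * likelihood(latest instance only)."""
    theta = float(np.mean(prior_samples))           # start in the prior's bulk
    log_p = kde_log_prior(theta, prior_samples) + log_likelihood(theta, x)
    chain = []
    for _ in range(n_steps):
        cand = theta + prop_scale * rng.standard_normal()
        log_p_cand = kde_log_prior(cand, prior_samples) + log_likelihood(cand, x)
        if np.log(rng.uniform()) < log_p_cand - log_p:  # MH acceptance rule
            theta, log_p = cand, log_p_cand
        chain.append(theta)
    return np.array(chain[n_steps // 2:])           # discard burn-in

# Stream three instances through one symbol's inference process; only the
# newest instance and the current sample set are ever stored.
samples = rng.normal(0.0, 5.0, size=200)            # broad initial prior
for x in [2.0, 2.2, 1.9]:
    samples = imh_update(samples, x)
```

After each update the returned samples become the prior for the next instance, so memory stays constant regardless of stream length, and the posterior remains a full sample-based distribution rather than a point estimate.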
From a purely computational perspective, we contribute a Bayesian classification model which is capable of supervised learning of an unlimited number of symbols (classes) from an infinite data stream, and which has simple parameterization. Because of its incremental learning qualities, the model handles concept drift transparently, i.e., it inherently supports non-stationary class distributions. We show that it matches the performance of state-of-the-art incremental learning methods. IMH is also a computational contribution in itself: we show that it vastly outperforms particle filters for incremental Bayesian inference, at least with a neural network model.
In the following section we present the background for our model, structured as a literature review of the principal assumptions, each of them identified as a subsection. We attempt to relate them by expressing them in a shared terminology, which allows for a unified perspective upon which we build the model. In the third section we describe our model in detail. The fourth section reviews existing and related models. In the fifth section we present the evaluation results, after which we finish with a Conclusions section.
Section snippets
Canonical circuits in the neocortex
The idea of elementary circuits as functional modules in the neocortex was hypothesized as early as 1938 [1], though it remains an open question [2]. A prominent hypothesis of this type is the columnar view of the neocortex, based on functional identification of neural circuits perpendicular to the pial surface [3], [4], [5], as well as a repeating template of neural distribution and connectivity found in such circuits [6], [7], [8]. In the columnar hypothesis, the smallest circuit is called a…
Overview and definitions
This work presents two independent but complementary contributions. The principal contribution is a model for incremental and Bayesian orthogonalization of a cognitive context, where each resulting cognitive symbol is adopted by a canonical circuit, and where all canonical circuits execute in parallel. The model describes how canonical circuits relate within a cognitive context, and how cognitive symbols develop while inhibiting each other. We refer to this model as Cognitive Context…
Biomorphic perspective
Computational models of the neocortex vary greatly in their objectives and biological inspiration. Many of them do not pursue a generic function but are specialized models, such as, for example, modeling ocular dominance with a 2D grid over which simple Hebbian-type relationships are simulated [25]. Models which are more generic cover only a subset of our five principal assumptions. Even if we ignore these assumptions, existing models usually present either no evaluation or one done on…
Evaluation
The objective of the evaluation is to see how CCON works as a supervised classifier on data streams. Part of the evaluation is also to compare IMH against particle filters used as inference methods within CCON. A limitation of our work is that we do not evaluate on datasets more closely related to biological behavior; doing so would require a network of cognitive contexts, which is part of our future work.
We evaluate CCON in classification tasks on data…
Conclusion
We started by identifying five assumptions about neocortical computation: the existence of canonical circuits, their relationship to cognitive symbols and contexts, Bayesian aspects, parallelism, and incremental requirements. Guided by these assumptions, we built our CCON model for a single cognitive context, which is continually orthogonalized into symbols by canonical circuits, can follow the evolution of non-stationary symbols, and can grow the number of symbols dynamically as the context…
Martin Dimkovski is a Computer Science Ph.D. student at York University in Toronto, Canada, in Dr. Aijun An's team. He is an Elia Scholar, an award conferred at York University for achievements in liberal education and interdisciplinary studies. Martin holds an M.Sc. in Computer Science, an M.Sc. in Information Technology, and a B.Sc. in Computer Science. His research interests are in artificial intelligence and knowledge technology, in particular related to modeling biological intelligence.
References (40)
- et al., Canonical microcircuits for predictive coding, Neuron (2012)
- The symbol grounding problem, Phys. D: Nonlinear Phenom. (1990)
- Architecture and structure of the cerebral cortex
- A cortical sparse distributed coding model linking mini- and macrocolumn-scale functionality, Front. Neuroanat. (2010)
- The columnar organization of the neocortex, Brain (1997)
- et al., The minicolumn hypothesis in neuroscience, Brain (2002)
- J. Szentagothai, The Ferrier lecture, 1977: the neuron network of the cerebral cortex: a functional interpretation, ...
- et al., A statistical analysis of information-processing properties of lamina-specific cortical microcircuit models, Cereb. Cortex (2007)
- et al., Interneurons of the neocortical inhibitory system, Nat. Rev. Neurosci. (2004)
- Whose cortical column would that be?, Front. Neuroanat.
- The neocortical column, Front. Neuroanat.
- How to grow a mind: statistics, structure, and abstraction, Science
- Free-energy and the brain, Synthese
- Towards a mathematical theory of cortical micro-circuits, PLoS Comput. Biol.
- Bayesian Data Analysis
- Pattern Recognition and Machine Learning
- Vision: A Computational Investigation into the Human Representation and Processing of Visual Information
Cited by (9)
- Bayes-OS-ELM: A novel ensemble method for classification application, 2019, Proceedings - 2019 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC 2019)
- Bayesian network learning based on characteristic confidence guidance under large data sets, 2019, International Journal of Circuits, Systems and Signal Processing
- Limited-length suffix-array-based method for variable-length motif discovery in time series, 2018, Journal of Internet Technology
- An incremental approach for sparse Bayesian network structure learning, 2018, Communications in Computer and Information Science
- Improved incremental algorithm of Naive Bayes, 2016, Tongxin Xuebao/Journal on Communications
Aijun An is a Professor of Computer Science at York University, Toronto, Canada. She received her PhD degree in computer science from the University of Regina in 1997, and held research positions at the University of Waterloo from 1997 to 2001. She joined York University in 2001. Her research area is data mining. She has published widely in premier journals and conference proceedings on various topics of data mining, including classification, clustering, data stream mining, transitional and diverging pattern mining, high utility pattern mining, sentiment and emotion analysis from text, topic detection, keyword search on graphs, social network analysis, and bioinformatics.