Self organized mapping of data clusters to neuron groups
Introduction
Kohonen’s famous work on self-organized maps (Kohonen, 1982, Kohonen, 1995) has given rise to an enormous number of publications. The original Kohonen map is inspired by the observation that, in the mammalian brain, the mapping of sensory inputs to the cortex is somatotopic, i.e. signals from neighboring body areas are mapped to neighboring cortex regions. This concept has proven very successful in practical applications. However, it is based on intuitive considerations rather than on a strict mathematical theory, and it therefore suffers from certain deficiencies, which are discussed in Kohonen (1995).
The original SOM and most of its variants require that the learning parameters (i.e. learning strength and range of adaptation within the neuron grid) are successively modified during the training phase. It is necessary to define a so-called annealing scheme controlling this time dependency. Unfortunately, there is no firm theoretical basis for constructing such schemes, and they are often determined empirically. That the mapping changes with time is the purpose of the process. However, the time dependency is an explicit one: the learning parameters vary with time, and thus the quantitative properties of the process itself change during the training phase. This prevents the classical SOM from learning a new mapping once the training is complete.
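The explicit time dependency can be made concrete with a minimal sketch of the classical Kohonen training loop (our illustration, with hypothetical parameter values): the annealed learning strength eta(t) and adaptation range sigma(t) are exactly the externally supplied schedule discussed above.

```python
import numpy as np

def train_som(data, grid_w=10, grid_h=10, n_steps=5000,
              eta0=0.5, sigma0=3.0, tau=2000.0, rng=None):
    """Classical Kohonen SOM training loop (illustrative sketch only).

    The explicit time dependency is visible directly: learning strength
    eta(t) and neighborhood range sigma(t) are annealed according to an
    externally chosen schedule, so the net loses plasticity as t grows.
    """
    rng = np.random.default_rng(rng)
    dim = data.shape[1]
    # weight vectors, one per grid node, randomly initialized
    w = rng.uniform(data.min(), data.max(), size=(grid_h, grid_w, dim))
    # grid coordinates of each node, used by the neighborhood function
    gy, gx = np.mgrid[0:grid_h, 0:grid_w]

    for t in range(n_steps):
        eta = eta0 * np.exp(-t / tau)       # annealed learning strength
        sigma = sigma0 * np.exp(-t / tau)   # shrinking adaptation range
        v = data[rng.integers(len(data))]   # random input signal
        # winner: node whose weight vector is closest to the signal
        d2 = ((w - v) ** 2).sum(axis=2)
        wy, wx = np.unravel_index(d2.argmin(), d2.shape)
        # Gaussian neighborhood around the winner on the grid
        h = np.exp(-((gy - wy) ** 2 + (gx - wx) ** 2) / (2 * sigma ** 2))
        w += eta * h[:, :, None] * (v - w)
    return w
```

The parameter values (eta0, sigma0, tau) are placeholders; any concrete choice constitutes an annealing scheme that must be tuned empirically, which is the deficiency criticized here.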
By now, there is a large number of papers which define new SOM types. Most of them have a mathematical basis, and some of them avoid the explicit time dependency.
Typical examples are the SOAN (self-organization with adaptive neighborhood neural network) (Iglesias & Barro, 1999) and the parameterless PLSOM (Berglund & Sitte, 2006). Both use the current mapping error to adjust the internal parameters of the adaptation process. In the time-adaptive SOM (TASOM) (Shah-Hosseini & Safabakhsh, 2003), each node has its own variable learning strength and neighborhood size. As a consequence, the network as a whole keeps its ability to learn independently of time. Somewhat simplified, the TASOM approach may be considered as shifting the dependency on externally controlled learning parameters to a dependency on individual node properties, thus converting the explicit time dependency into an implicit one. The present paper pursues a similar strategy. A rather elaborate mathematical procedure is used in the auto-SOM (Haese, 1999; Haese & Goodhill, 2001): here the weight vectors are adapted by means of a Kalman filter, and the learning parameters are determined so as to minimize the prediction error variance. With regard to the result, the generative topographic mapping (GTM; Bishop, Svensén, & Williams, 1998) is also a SOM. Its adaptation algorithm is probabilistic and needs neither a decreasing learning strength nor a shrinking range of adaptation. However, the mapping problem is formulated in terms of a latent variable model, and thus the connection to Kohonen’s approach is weak. In some SOM variants, the number of neurons and the whole network structure change with time. Although this is a very strong time dependency, it is an implicit one: addition and removal of neurons are controlled solely by the stream of input signals. Examples are the growing cell structure (Fritzke, 1994a), the growing neural gas (Fritzke, 1994b, 1995) and the plastic self-organizing map (Lang & Warwick, 2002).
Compared to the original Kohonen SOM, these variants are remarkable improvements. They do not need external control of parameters, and the TASOM in particular is well suited for tasks with changing data sets. Above all, they achieve a better mapping quality, at least in certain applications.
On the other hand, mapping quality and performance become less important if SOM structures are used to model processes in the brain. Here we are confronted not only with excellent function, but also with malfunction occurring in certain situations. Therefore it seems useful to study another kind of SOM which is designed primarily for biological modelling purposes, and not for better performance.
Soon after the first SOM publications, Merzenich et al. demonstrated that the reorganization observed in the sensory cortex of mammals after peripheral nerve damage can be modeled as an adaptation process in a Kohonen map (Merzenich et al., 1984). Martinetz, Ritter, and Schulten (1988) showed that the auditory cortex of a bat may be considered as a neighborhood-preserving map of the relevant space of ultrasonic signals. By now, the SOM concept is accepted as a tool for understanding the processing of sensory signals in the brain (Kaas, 1991; Sirosh and Miikkulainen, 1995; Turrigiano and Nelson, 2004; Wiemer et al., 2000). Nevertheless, at least the classical SOM has two properties which hamper its use for the modelling task described above:
1. An explicit time dependence means that there exists a control mechanism outside the SOM. When the training for a task is completed, this mechanism would be responsible for restoring the plasticity if a new task is to be learned. The time-independent SOM variants mentioned above avoid this problem, but their mathematical procedures cannot easily be transferred into a biological framework.
2. The training process of the classical SOM adapts the whole set of disposable neurons to a given set of signal vectors. No neurons remain that could be used for a subsequent training process with a different set of signal vectors. This contradicts the observed plasticity of the brain: we can learn many new tasks without forgetting those we have learned before.
It therefore appears useful to modify the Kohonen training procedure, retaining the principal properties of the original SOM while avoiding the undesirable properties pointed out above. The necessary modifications are the following:
- The adaptation procedure must not contain an explicit time dependency.
- Training with a certain data set D_i should result in a mapping of D_i to a certain neuron group G_i, and not to the whole neuron set G. The bulk of neurons should be left disposable for subsequent training phases with different data sets D_j.
The SOM introduced in the present paper fulfills these requirements. However, in contrast to the time-independent SOMs mentioned above, it is constructed in a purely intuitive way. The goal is not an improved mapping, but a neurophysiologically plausible mechanism. Rather than from a mathematical theory, the design starts from the idea that a firing neuron remains for some time in an excited state with increased sensitivity, and that its excitation spreads to its neighbors. As in the TASOM, a particular neuron has not only its own synaptic coupling vector but, in addition, a state which determines its behavior. The state, however, is only a single quantity a, the activation. This is enough to allow for a time-independent adaptation procedure which does not use all available neurons, but only a certain neuron group, to represent a data cluster. Because of this property of mapping data clusters to neuron groups, the resulting SOM is called DCNG-SOM. As in the growing neuron structures (Fritzke, 1994a; Fritzke, 1995; Lang and Warwick, 2002), the incoming stream of data controls the number of neurons involved in the mapping of a cluster. In contrast to these SOM variants, the neurons are not taken from a potentially infinite stock. Instead, the number of neurons is fixed, and they have fixed positions in physical space.
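To make the idea of a state-driven, time-independent adaptation concrete, here is a deliberately simplified toy sketch. It is our own construction for illustration only, not the DCNG algorithm defined in the following section: each neuron carries an activation that decays and spreads to grid neighbors, and the adaptation strength is derived from that state alone.

```python
import numpy as np

class ToyActivationSOM:
    """Toy sketch (our construction, NOT the paper's DCNG procedure):
    each neuron carries an activation a that decays over time and
    spreads to grid neighbors when the neuron fires.  The adaptation
    strength is derived from a alone, so no external annealing
    schedule is needed, and neurons that are never excited stay
    unused and remain available for later data sets.
    """

    def __init__(self, grid=20, dim=2, decay=0.9, spread=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.uniform(0, 1, size=(grid, grid, dim))
        self.a = np.zeros((grid, grid))   # activation state per neuron
        self.decay, self.spread = decay, spread

    def step(self, v):
        # prefer already-excited neurons: bias the distance by activation
        d2 = ((self.w - v) ** 2).sum(axis=2) * (1.0 - 0.5 * self.a)
        iy, ix = np.unravel_index(d2.argmin(), d2.shape)
        self.a[iy, ix] = 1.0              # winner becomes fully excited
        # excitation spreads to the 4-neighborhood on the grid
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            y, x = iy + dy, ix + dx
            if 0 <= y < self.a.shape[0] and 0 <= x < self.a.shape[1]:
                self.a[y, x] = max(self.a[y, x], self.spread)
        # the activation itself acts as an implicit, state-driven
        # learning rate; inactive neurons do not move at all
        self.w += self.a[:, :, None] * 0.2 * (v - self.w)
        self.a *= self.decay              # excitation fades with time
```

Feeding such a net a single tight cluster moves only the small group of neurons that ever became excited; the rest of the grid remains untouched, which is the qualitative behavior motivated above.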
The aspect of performance in applications is not considered. We simply study the properties of the DCNG-SOM resulting from the use of the modified neuron. This is because the long-term objective is not only to model neuroplasticity, but also to explain disorders in the mapping of sensory signals to the sensory cortex. Such disordered mappings have been observed, e.g. in patients suffering from focal dystonia.
Focal dystonia is a movement disorder occurring in several forms, for example writer’s cramp and musician’s cramp. Apparently the cause is not an organic defect but rather something like an overtraining. This suggests that focal dystonia has to do with a misled self-organization process. Sanger and Merzenich (2000) have hypothesized that it is a manifestation of an unstable sensorimotor control loop. They discuss several mechanisms which could possibly lead to a gain > 1 in the feedback loop. The mapping mentioned above is a link in this loop. As yet, theoretical studies on focal dystonia concentrate on the time behavior of sensory signals and on the control-theoretic aspects of the problem. However, in some musicians with focal dystonia, the disordered cortical representation of the digits could be observed directly by functional magnetic resonance imaging (Elbert et al., 1998). The increasing empirical material in this field suggests studying the formation of disordered mappings with SOM-based models.
It is characteristic of focal dystonias that they are task specific. The focal dystonia of musicians affects the control of finger movements only in the context of instrument playing; it does not affect the function of the same fingers in other activities. This means that the disturbed mapping is effective only in a special context. The context specificity becomes understandable if a cortical region is described as a SOM. A sensory stimulus might be considered as a signal vector consisting of a sensory part and a part which describes the context. In this picture, it appears plausible that stimuli with a certain context part are mapped to a corrupted part of the map, while stimuli with the same sensory part but a different context part are processed in correctly working regions. Admittedly, it remains an open question whether sensory stimuli are actually encoded as high-dimensional signal vectors.
In the literature, there are some context-aware SOM variants (e.g. the recursive self-organizing map by Voegtlin (2002)). They are designed to represent the temporal context of patterns, and in principle they allow the reconstruction of pattern sequences. In contrast, the present paper restricts itself to the simpler problem of constructing different maps for data clusters which are presented successively to the net.
Mapping of data clusters to neuron groups: The training procedure
Let us assume a data set D with the property D = D_1 ∪ D_2 ∪ … ∪ D_n and D_i ∩ D_j = ∅ for i ≠ j, i.e. D consists of n distinct clusters D_i.
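A cluster structure of this kind can be instantiated as test data, e.g. (a hypothetical helper of ours, not the paper’s experimental setup):

```python
import numpy as np

def make_clustered_data(n_clusters=3, pts_per_cluster=200, dim=2, seed=0):
    """Generate a data set D consisting of n distinct, well-separated
    clusters D_1, ..., D_n (hypothetical illustration data).  Cluster
    centers are spaced far apart relative to the cluster radius, so
    the clusters do not overlap."""
    rng = np.random.default_rng(seed)
    # deterministic centers on the diagonal, 3.0 apart per axis
    centers = np.arange(n_clusters)[:, None] * np.full(dim, 3.0)
    # each cluster: a tight Gaussian blob around its center
    return [c + rng.normal(0.0, 0.1, size=(pts_per_cluster, dim))
            for c in centers]
```

With a cluster radius of about 0.1 and center spacing of 3.0 per axis, the disjointness assumption D_i ∩ D_j = ∅ holds to any practical accuracy.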
Further, assume a set G of neurons located in a finite region of three-dimensional space. This region might be thought of as a cortical region with irregularly distributed neurons. For a first study of our adaptation procedure, however, we will consider the simple case of a two-dimensional regular grid of neurons. This simplifies
Experiments
In what follows, we will restrict ourselves to low-dimensional examples. In this case, results can still be visualized easily, while on the other hand the dimension is sufficient to illustrate the motivation behind the mapping of clusters to neuron groups. Imagine for instance that some signal components represent sensory signals from a finger, and that two distinct values of a context component characterize two different contexts, e.g. writing and violin playing. One of the test cases studied (Experiment EX1) is
Neurons outside of groups
In neurobiology, learning of novel associations by adult humans is explained by synaptic plasticity and the recruitment of ‘unused’ neurons (see for example Hogan and Diederich (1995)). The question as to whether, and to what extent, unused neurons exist and new neurons are generated is still under discussion (Draganski et al., 2004). In the literature on neural networks, the concept of unused neurons also occurs (Khosravi & Safabakhsh, 2005). Evidently, it has slightly different meanings in different
Behavior for large signal space dimension
If the signal space dimension d is large, the distances between an arbitrary signal vector v and all the weight vectors are very close to each other. This is due to the fact that the spherical shell between r and r + dr, containing all points with distance r from v, has a volume increasing with r^(d−1), while on the other hand r is limited by the finite hypercube. Thus one should suspect that the winner cannot be uniquely identified. To investigate this effect, an experiment EX4 was carried out with
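The concentration of distances is easy to verify numerically. The following small Monte Carlo sketch (ours, not experiment EX4) measures the relative spread of distances from one random “signal” to many random “weight vectors” in the unit hypercube:

```python
import numpy as np

def distance_spread(dim, n_points=500, seed=0):
    """Relative spread (max - min) / mean of the distances from one
    random 'signal' point to many random 'weight' points in the unit
    hypercube.  As dim grows, the spread shrinks: all weight vectors
    become nearly equidistant from the signal, so the winner is
    barely distinguishable from its competitors."""
    rng = np.random.default_rng(seed)
    q = rng.uniform(0.0, 1.0, size=dim)                # signal vector
    pts = rng.uniform(0.0, 1.0, size=(n_points, dim))  # weight vectors
    d = np.linalg.norm(pts - q, axis=1)
    return (d.max() - d.min()) / d.mean()
```

For dim = 2 the spread is of the order of the mean distance itself, while for dim in the hundreds it shrinks toward a small fraction of the mean, illustrating why the winner search degrades.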
Initialization for large dimension
For high-dimensional signal spaces, the adequate initialization of a SOM becomes difficult. In the experiments EX1–EX4, the DCNG-SOM was initialized randomly, i.e. the weight vectors were distributed randomly within a hypercube enclosing the anticipated data sets. The idea behind this strategy is that for any signal v there should be a weight vector not too far from v. This can be passably fulfilled for small dimensions but not for large d. Let us assume for the moment that the initial weight vectors define a
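The breakdown of the random initialization strategy can likewise be checked numerically (again an illustrative sketch of ours):

```python
import numpy as np

def nearest_init_weight(dim, n_neurons=400, seed=0):
    """Distance from a random signal to the nearest of n_neurons
    randomly initialized weight vectors in the unit hypercube.  For
    small dim the nearest weight vector is close to any signal; for
    large dim even the nearest one is far away, so random
    initialization no longer places a weight vector 'not too far'
    from each signal."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.0, 1.0, size=(n_neurons, dim))  # initial weights
    v = rng.uniform(0.0, 1.0, size=dim)               # a test signal
    return np.linalg.norm(w - v, axis=1).min()
```

With a few hundred neurons, the nearest initial weight vector is within a few hundredths of the signal for dim = 2, but several units away for dim of order 100.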
Conclusions
As already explained, the intention of this paper is not to present an improved SOM for purposes of data analysis. However, the DCNG-SOM has some properties which might qualify it as a useful structure to model neuroplasticity, and especially those aspects of learning which are connected with focal dystonia. They can be summarized as follows:
- The adaptation procedure has no explicit time dependence.
- The DCNG-SOM does not exhaust the whole neuron grid for a single data cluster. This is important
References (26)

- Fritzke (1994). Growing cell structures — a self-organizing network for unsupervised and supervised learning. Neural Networks.
- Voegtlin (2002). Recursive self-organizing maps. Neural Networks.
- Apollos Gabe und Fluch – Funktionelle und dysfunktionelle Plastizität bei Musikern [Apollo’s gift and curse – functional and dysfunctional plasticity in musicians]. Neuroforum (2003).
- Berglund & Sitte (2006). The parameter-less self-organizing map algorithm. IEEE Transactions on Neural Networks.
- Bishop, Svensén, & Williams (1998). GTM: The generative topographic mapping. Neural Computation.
- Draganski et al. (2004). Neuroplasticity: Changes in grey matter induced by training. Nature.
- Elbert et al. (1998). Alteration of digital representations in somatosensory cortex in focal hand dystonia. NeuroReport.
- Fritzke (1995). A growing neural gas network learns topologies.
- Haese (1999). Kalman filter implementation of self-organizing feature maps. Neural Computation.
- Haese & Goodhill (2001). Auto-SOM: Recursive parameter estimation for guidance of self-organizing feature maps. Neural Computation.