Abstract
This paper presents a novel architecture for associative learning and recall of different sensor and actuator patterns. A modular design allows the inclusion of various input and output modalities. The approach is a generic one that can deal with any kind of multidimensional real-valued data. Sensory data are incrementally grouped into clusters, which represent different categories of the input data. Clusters of different sensors or actuators are associated with each other based on the co-occurrence of corresponding inputs. Upon presenting a previously learned pattern as a cue, associated patterns can be recalled. The proposed architecture has been evaluated in a practical situation in which a robot had to associate visual patterns in the form of road signs with different configurations of its arm joints. This experiment assessed how long it takes to learn stable representations of the input patterns and tested the recall performance for different durations of learning. Depending on the dimensionality of the data, stable representations require many inputs to form, and only over time are similar small clusters combined into larger ones. Nevertheless, sufficiently good recall can be achieved earlier, when the topology is still in an immature state and similar patterns are distributed over several clusters. The proposed architecture tolerates small variations in the inputs and can generalise over the varying perceptions of specific patterns while remaining sensitive to fine geometrical shapes.
Appendix: M-SOINN Algorithm
This section describes the complete M-SOINN algorithm. The following notation is used:

- \(|| \cdot ||\) denotes the Euclidean distance measure.
- dim denotes the dimensionality of the input patterns.
- p denotes the current input pattern.
- \({\text {count}}_p\) denotes the number of inputs since the last clean-up step.
- N denotes the set of nodes in the topology.
- \({\text {thr}}_n\) denotes the similarity threshold for node n.
- \({\text {err}}_n\) denotes the error for node n.
- \({\text {sig}}_n\) denotes the number of signals for node n.
- E denotes the set of edges in the topology.
- \({\text {age}}_e\) denotes the age of edge e.
- \({\text {neighbours}}(n)\) denotes all nodes that are directly connected to node n by an edge.
- C denotes the set of clusters in the topology.
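For concreteness, the state described by this notation could be held in plain Python structures such as the following. This is a minimal sketch under my own naming assumptions (`nodes`, `edges`, `neighbours`), not the authors' implementation:

```python
import math

def dist(p, q):
    """Euclidean distance || p - q ||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# N: each node n carries its pattern p_n, error err_n and signal count sig_n.
nodes = [
    {'pattern': [0.0, 0.0], 'sig': 3, 'err': 0.2},
    {'pattern': [3.0, 4.0], 'sig': 1, 'err': 0.0},
]

# E: edges as unordered index pairs, each mapped to its age age_e.
edges = {frozenset((0, 1)): 0}

def neighbours(i, edges):
    """All node indices directly connected to node i by an edge."""
    return [j for e in edges if i in e for j in e if j != i]

# dim is the pattern length; count_p would be a plain integer counter.
dim = len(nodes[0]['pattern'])
```

Representing edges as `frozenset` pairs keeps them undirected, so \((s_1, s_2)\) and \((s_2, s_1)\) map to the same age entry.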
An input is processed in the following steps:
1. Input a new pattern \(p \in \mathbb {R}^{{\text {dim}}}\).
2. Increment the number of inputs: \({\text {count}}_p = {\text {count}}_p + 1\).
3. If the number of nodes is less than two (\(|N| < 2\)):
   (a) Create a new node \(n_{\text{new}}\) with pattern \(p\): \(N = N \cup n_{\text{new}}\) with \(p_{\text{new}} = p\).
   (b) Initialise the number of signals for node \(n_{\text{new}}\) to 1: \(\text{sig}_{n_\text{new}} = 1\).
   (c) Proceed with the next input.
4.
Find the two nodes which are nearest to the current input pattern p:
-
(a)
\(s_1 = \arg \min _{n \in N} || p - p_n ||\)
-
(b)
\(s_2 = \arg \min _{n \in N \backslash s_1} || p - p_n ||\)
-
(a)
-
5.
Update the similarity threshold \({\text {thr}}_s\) for node \(s \forall s \in {s_1, s_2}\):
-
If a fixed threshold is used, set the threshold according to the parameter \(\tau\): \({\text {thr}}_s = \sqrt{\tau ^2 \cdot {\text {dim}}}\).
-
If a dynamic threshold is used:
-
If node s has neighbours (edges exist between node s and other nodes), set the threshold to the maximum distance to its neighbours: \({\text {thr}}_s = \max _{n \in {\text {neighbours}}(s)} || p_s - p_n ||\).
-
If node s is isolated (no edges exist between node s and other nodes), set the threshold to the minimum distance to any other nodes: \({\text {thr}}_s = \min _{n \in N \backslash s} || p_s - p_n||.\)
-
-
-
6.
If at least one of the distances to the two nearest nodes exceeds the corresponding similarity threshold (\(|| p - p_{s_1} || > {\text {thr}}_{s_1}\) or \(|| p - p_{s_2} || > {\text {thr}}_{s_2}\)):
-
(a)
Create a new node \(n_{\text{new}}\) with pattern \(p:\) \(N = N \cup n_{\text{new}}\) with \(p_{\text{new}} = p.\)
-
(b)
Initialise the number of signals for node \(n_{\text{new}}\) to \(1:\) \(\text{sig}_{n_\text{new}} = 1.\)
-
(c)
(Connection of new nodes)
If the distance to the nearest node is within the threshold (\(|| p - p_{s_1} || \le {\text {thr}}_{s_1}\)):
-
(i)
Let c denote the cluster which node s 1 belongs to.
-
(ii)
Check if the distance to the cluster mean is greater for node n new than for node s 1:
-
(A)
Let \(\overline{p_c}\) denote the mean of cluster \(c\) with \(\overline{p_c} = \frac{ \sum _{n \in N_c} p_n }{ |N_c| }.\)
-
(B)
Check if \(|| \overline{p_c} - p || > || \overline{p_c} - p_{s_1} ||\) (condition 1).
-
(A)
-
(iii)
Check if the distance between node nnew and node s 1 is greater than the average distance between nodes of cluster c:
-
(A)
Let \(\overline{d_c}\) denote the average distance between the nodes of cluster c.
-
(B)
Check if \(|| p - p_{s_1} || > \overline{d_c}\)(condition 2).
-
(A)
-
(iv)
If both conditions 1 and 2 are fulfilled:
-
(A)
Create an edge between node n new and node s 1: \(E = E \cup (n_{\text{new}},s_1).\)
-
(A)
-
(i)
-
(a)
-
7.
If both the distances to the two nearest nodes are within the corresponding similarity thresholds (\(|| p - p_{s_1} || \le \text{thr}_{s_1}\) and \(|| p - p_{s_2} || \le \text{thr}_{s_2}\)):
-
(a)
Increment the age of all edges that are directly connected to node \(s_1:\) \({\text {age}}_{(s_1,n)} = {\text {age}}_{(s_1,n)} + 1 \forall n \in \text{neighbours}(s_1).\)
-
(b)
If no edge exists between node \(s_1\) and node \(s_2,\) create this edge: \(E = E \cup (s_1,s_2).\)
-
(c)
Reset the age of the edge between node \(s_1\) and node \(s_2:\) \({\text {age}}_{(s_1,s_2)} = 0.\)
-
(d)
Increase the accumulated error of node \(s_1\) by the distance to the pattern \(p:\) \(\text{err}_{s_1} = \text{err}_{s_1} + || p_{s_1} - p ||.\)
-
(e)
Increment the number of signals for node \(s_1:\) \(\text{sig}_{s_1} = \text{sig}_{s_1} + 1.\)
-
(f)
Adjust the pattern of node \(s_1:\) \(p_{s_1} = p_{s_1} + \epsilon _1 \cdot (p - p_{s_1})\) with \(\epsilon _1 = \frac{1}{{\text {sig}}_{s_1}} .\)
-
(g)
Adjust the patterns of the neighbours of node \(s_1:\) \(p_n = p_n + \epsilon _2 \cdot (p - p_n) \forall n \in \text{neighbours}(s_1)\) with \(\epsilon _2 = 0.01 \cdot \frac{1}{{\text {sig}}_{s_1}} .\)
-
(h)
Remove edges with an age greater than the specified \({\text {age}}_{{\text {dead}}}:\) \(E = E \backslash e \forall e \in E\) with \({\text {age}}_e > {\text {age}}_{{\text {dead}}}.\)
-
(a)
-
8.
If the number of inputs reaches a pre-defined value \(\lambda \) (\({\text {count}}_p \ge \lambda \)), clean up the topology:
-
(a)
Reset the number of inputs: \({\text {count}}_p = 0.\)
-
(b)
(Removal of longest edge)
Remove the edge with the maximum length: \(E = E \backslash e_m\) with \(e_m = \arg \max _{(n_a,n_b) \in E} || p_a - p_b || .\)
-
(c)
(Removal of minimum-density node)
Remove the node with the minimum number of signals: \(N = N \backslash n_m\) with \(n_m = \arg \min _{n \in N} \text{sig}_n.\)
-
(d)
Reduce the local error by inserting a new node:
-
(i)
Determine the node \(n_q\) with the maximum error: \(n_q = \arg \max _{n \in N} \text{err}_n.\)
-
(ii)
If node \(n_q\) has neighbours:
-
(A)
Determine the neighbour \(n_f\) with the maximum error: \(n_f = \arg \max _{n \in {\text {neighbours}}(n_q)} \text{err}_n.\)
-
(B)
Create a new node \(n_r:\) \(N = N \cup n_r.\)
-
(C)
Set the pattern for node \(n_r\) to \( p_r = \frac{1}{2} (p_q + p_f).\)
-
(D)
Set the error for node \(n_r\) to \( {\text {err}}_r = {\text {err}}_q. \)
-
(E)
Set the number of signals for node \(n_r\) to \(\text{sig}_r = \text{sig}_q.\)
-
(F)
Decrease the error of node \(n_q:\) \( {\text {err}}_q = 0.5 \cdot {\text {err}}_q.\)
-
(G)
Decrease the error of node \(n_f\): \( {\text {err}}_f = 0.5 \cdot {\text {err}}_f \).
-
(H)
Create an edge between node \(n_q\) and node \(n_r\): \( E = E \cup (n_q,n_r) \).
-
(I)
Create an edge between node \(n_r\) and node \(n_f\): \( E = E \cup (n_r,n_f) \).
-
(J)
Remove the edge between node \(n_q\) and node \(n_f:\) \( E = E \backslash (n_q,n_f) \).
-
(A)
-
(i)
-
(e)
Prune clusters by removing weakly connected nodes:
-
(i)
Let \(\overline{{sig}}\) denote the average number of signals in the topology: \( \overline{sig} = \frac{ \sum _{n \in N} {\text {sig}}_n }{ |N| } \).
-
(ii)
Remove nodes with two neighbours based on the number of signals and the \(c_2\) parameter:
\( \forall n \in N \): \( N = N \backslash n \) if \( |{\text {neighbours}}(n)| = 2\) and \({\text {sig}}_n < c_2 \cdot \overline{sig} \).
-
(iii)
Remove nodes with one neighbour based on the number of signals and the \(c_1\) parameter:
\( \forall n \in N \): \( N = N \backslash n \) if \(|\text{neighbours}(n)| = 1\) and \( \text{sig}_n < c_1 \cdot \overline{sig}\).
-
(iv)
Remove isolated nodes:
\( \forall n \in N\): \(N = N \backslash n\) if \(|\text{neighbours}(n)| = 0\).
-
(i)
-
(f)
(Joining of clusters)
If at least two clusters exist in the topology (\( |C| \ge 2 \)), join two clusters if they are close enough to each other:
-
(i)
Determine the two clusters \(c_a\) and \(c_b\) with the minimum distance between them:
\(c_a, c_b = \arg \min _{c_a, c_b \in C, c_a \ne c_b} || p_a - p_b || \) with \( n_a \in c_a, n_b \in c_b \).
-
(ii)
Let \(d_{{\text {min}}}\) denote the minimum distance between the clusters \(c_a\) and \(c_b\):
\( d_{{\text {min}}} = \min _{c_a, c_b \in C, c_a \ne c_b} || p_a - p_b || \) with \( n_a \in c_a, n_b \in c_b \).
-
(iii)
If an absolute join tolerance is used, join the clusters based on the parameter \(\phi \):
-
(A)
If \( d_{{\text {min}}} < \sqrt{\phi ^2 \cdot {\text {dim}}} \)
create an edge between node \(n_a\) and node \(n_b\): \(E = E \cup (n_a,n_b)\).
-
(A)
-
(iv)
If a relative join tolerance is used, join the clusters based on the parameter \(\psi \) and the average node distances in clusters \(c_a\) and \(c_b\):
-
(A)
Let \(\overline{d_{c_a}}\) denote the average distance between the nodes of cluster \(c_a\).
-
(B)
Let \(\overline{d_{c_b}}\) denote the average distance between the nodes of cluster \(c_b\).
-
(C)
If \( d_{{\text {min}}} < \psi \cdot \overline{d_{c_a}} \) and \( d_{{\text {min}}} < \psi \cdot \overline{d_{c_b}} \)
create an edge between node \(n_a\) and node \(n_b\): \(E = E \cup (n_a,n_b)\).
-
(A)
-
(v)
If an edge was created (and two clusters were joined), repeat (Joining of clusters).
-
(i)
-
(a)
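As a concrete (and simplified) illustration of steps 3–7, the following Python sketch processes one input with the fixed-threshold variant. The function name, the dict-based node records, and the index-based edge set are my own assumptions, not the authors' implementation; the clean-up phase (step 8) and the cluster-based connection test of step 6(c) are omitted:

```python
import math

def dist(p, q):
    """Euclidean distance || p - q ||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def process_input(p, nodes, edges, tau, dim):
    """One M-SOINN input step with a fixed similarity threshold.

    nodes: list of dicts {'pattern': [...], 'sig': int, 'err': float}
    edges: dict mapping frozenset({i, j}) of node indices to the edge age
    """
    if len(nodes) < 2:                                   # step 3: bootstrap
        nodes.append({'pattern': list(p), 'sig': 1, 'err': 0.0})
        return
    # Step 4: the two nodes nearest to p.
    order = sorted(range(len(nodes)), key=lambda i: dist(p, nodes[i]['pattern']))
    s1, s2 = order[0], order[1]
    # Step 5: fixed threshold thr = sqrt(tau^2 * dim).
    thr = math.sqrt(tau ** 2 * dim)
    if dist(p, nodes[s1]['pattern']) > thr or dist(p, nodes[s2]['pattern']) > thr:
        # Step 6: at least one threshold exceeded -> insert a new node.
        nodes.append({'pattern': list(p), 'sig': 1, 'err': 0.0})
        return
    # Step 7: both within threshold -> adapt the winner and its neighbours.
    for e in edges:
        if s1 in e:
            edges[e] += 1                                # 7(a): age s1's edges
    edges[frozenset((s1, s2))] = 0                       # 7(b)/(c): create or reset edge
    n1 = nodes[s1]
    n1['err'] += dist(n1['pattern'], p)                  # 7(d): accumulate error
    n1['sig'] += 1                                       # 7(e): count the signal
    eps1 = 1.0 / n1['sig']                               # 7(f): move s1 towards p
    n1['pattern'] = [x + eps1 * (y - x) for x, y in zip(n1['pattern'], p)]
    eps2 = 0.01 * eps1                                   # 7(g): nudge the neighbours
    for e in list(edges):
        if s1 in e:
            j = next(i for i in e if i != s1)
            nodes[j]['pattern'] = [x + eps2 * (y - x)
                                   for x, y in zip(nodes[j]['pattern'], p)]
    # 7(h): removing edges with age > age_dead would follow here.
```

Feeding a few patterns through shows the behaviour: far-apart inputs spawn new nodes, while an input within both thresholds ages the winner's edges, links \(s_1\) and \(s_2\), and pulls \(s_1\) towards the input by \(\epsilon_1 = 1/\text{sig}_{s_1}\).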
Keysermann, M.U., Vargas, P.A. Towards Autonomous Robots Via an Incremental Clustering and Associative Learning Architecture. Cogn Comput 7, 414–433 (2015). https://doi.org/10.1007/s12559-014-9311-y