A hand shape instruction recognition and learning system using growing SOM with asymmetric neighborhood function
Introduction
Theories and techniques of Human–Machine Interaction (HMI) are becoming increasingly important with the rapid development of partner robots such as housework robots, entertainment robots, and pet robots. Users of robots and other intelligent machines are no longer limited to experts and engineers; they now include ordinary people and even children. An intelligent autonomous robot must therefore acquire and analyze visual or auditory information to estimate the intentions or instructions of a human. Accordingly, approaches such as gesture or posture recognition [1], [2], [3], [4], [5], [6], voice discrimination [7], [8], [9], and face recognition [10], [11], [12] have attracted many researchers.
Generally, for autonomous systems, there are two common issues to deal with:
- a) What is the input?
- b) How to get the output, or adapt the input for an unknown input?
The first issue seems to be the more important one, because the second is directly related to it. The input data need to be classified, recognized, and recorded by the system, which is the major task of pattern recognition.
In [2], [3], [7], [8], [9], Kohonen's Self-Organizing Map (SOM) [13], [14], [15], [16], [17] is used to solve the first issue. SOM is a powerful unsupervised machine learning method for clustering high-dimensional data. As a type of artificial neural network (ANN) that models competitive learning in the primary sensory cortex, SOM has been extensively applied to data mining, pattern recognition, and visualized clustering of complex data sets.
In [4], [5], a hidden Markov model (HMM) is used to recognize hand gestures. An HMM is a stochastic model which supposes that an unknown system is a Markov process with unknown parameters and estimates those parameters from the observable information. Because of its high computational cost, HMM is not widely used in practice.
In [6], Kuremoto et al. used a simplified retina-V1 model and a one-pass dynamic programming method to estimate the different gestures of a user's arm.
Zhao, Huang and Sun [10] proposed a neural networks committee (NNC) machine to tackle face recognition based on multi-features.
In [11], Chen et al. proposed a kernel machine-based one-parameter regularized Fisher discriminant method for face recognition, and showed that it is superior to conventional linear discriminant analysis (LDA) methods.
Li, Zheng and Huang in [12] gave an improved LDA named “locally linear discriminant embedding (LLDE)”.
Additionally, advanced ANNs [18], [19], [20] and support vector machines (SVMs) [21] are also powerful tools for pattern recognition and HMI problems.
In this paper, we concentrate on developing a visual recognition system using an improved SOM. The system for HMI is supposed to classify input images consisting of different shapes of a user’s hand, and learn how to output correct actions according to the judgment of the user (instructor).
Theoretically, the original SOM has some defects, such as:
- a) Exhaustion of units.
- b) Difficulty of adjustment of parameters.
- c) Disorder of topology of map.
To overcome these problems, different improved SOM variants have been proposed.
For example, to deal with the exhaustion of units, Growing SOM (GSOM) was proposed in [22], [23], [24], [25], and Transient-SOM (T-SOM) was proposed in our previous works [26], [27].
To keep learning convergence stable and use fewer parameters, Parameterless SOM (PL-SOM) was proposed by Berglund & Sitte in [28], and we adapted its idea to GSOM as a Parameterless Growing SOM (PL-G-SOM) [7], [8], [9], [29].
To avoid topological defects of the map during the learning process, an asymmetric neighborhood function was recently introduced into the original SOM by Aoki and Aoyagi [30], [31], [32]. However, this method has not been applied to other SOM variants such as GSOM and PL-G-SOM.
In this paper, we propose introducing Aoki and Aoyagi's asymmetric neighborhood function [30], [31], [32] into GSOM, yielding two novel SOM variants: one named “AGSOM” and the other “IAGSOM”. In AGSOM, the direction of the asymmetric neighborhood is fixed during the learning process. In IAGSOM, opposite directions are used iteratively to resolve the disorder of topology asymptotically. AGSOM and IAGSOM are used in a hand shape instruction recognition and learning system [26], [27], [29] for HMI. Experiments including image processing were performed to validate the proposed system.
In the rest of this paper, we describe the original SOM [13], [14], [15], PL-G-SOM [7], [8], [9], [29], AGSOM (proposed), and IAGSOM (proposed) in Section 2, and apply them to a hand shape instruction recognition and learning system in Section 3. Hand image instruction learning experiments are reported in Section 4, and conclusions are given in Section 5.
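The difference between a fixed and an alternating neighborhood direction can be illustrated with a small sketch on a 1-D map. Everything here is an assumption made for illustration: the skewed-Gaussian form, the parameter names `sigma` and `beta`, and the helper `direction_for_step` with its switching `period` are hypothetical and are not taken from [30], [31], [32] or from the proposed AGSOM and IAGSOM.

```python
import math

def asymmetric_neighborhood(offset, sigma, beta, direction):
    """Skewed Gaussian neighborhood on a 1-D map (illustrative form only).

    offset    : grid position of a unit minus grid position of the BMU
    sigma     : neighborhood width
    beta      : asymmetry factor (beta = 1 gives the ordinary Gaussian)
    direction : +1 or -1, the side of the BMU that is favored
    """
    signed = offset * direction
    if signed >= 0:
        scaled = signed / beta      # stretched on the favored side
    else:
        scaled = -signed * beta     # compressed on the opposite side
    norm = 2.0 / (beta + 1.0 / beta)
    return norm * math.exp(-(scaled ** 2) / (2.0 * sigma ** 2))

def direction_for_step(step, period):
    """Hypothetical schedule: flip the favored direction every `period` steps."""
    return 1 if (step // period) % 2 == 0 else -1

# Fixed direction (in the spirit of AGSOM): always +1.
fixed = asymmetric_neighborhood(2.0, sigma=1.0, beta=1.5, direction=1)

# Alternating direction (in the spirit of IAGSOM): depends on the step.
alternating = [
    asymmetric_neighborhood(2.0, 1.0, 1.5, direction_for_step(t, period=10))
    for t in (0, 10, 20)
]
```

With `beta` greater than 1, units on the favored side of the BMU receive a larger update weight than units at the same distance on the other side; flipping `direction` mirrors that bias.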
Section snippets
Original SOM
Kohonen's SOM algorithm maps n-dimensional data x (x1, x2, …, xn) to m units in a low-dimensional space (the so-called “feature map”) with connections mi (m1, m2, …, mn) by a winner-takes-all rule that selects the unit minimizing the Euclidean distance ||x − mi||, where i = 1, 2, …, m indexes the units on the low-dimensional map (a 1- or 2-dimensional grid). The selected unit is the best-match-unit (BMU), that is, the unit on the map with the shortest Euclidean distance to the input data x. Initially, connection weight mi is given by a random value, and following
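The winner-takes-all selection described in this snippet can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the function name `find_bmu` and the sample weights are hypothetical.

```python
import math

def find_bmu(x, weights):
    """Return the index of the unit whose weight vector is closest to x.

    x       : input vector (sequence of n floats)
    weights : list of m weight vectors, one per map unit
    """
    best_index, best_dist = -1, math.inf
    for i, m_i in enumerate(weights):
        dist = math.dist(x, m_i)  # Euclidean distance between x and m_i
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index

# Three units in a 2-D input space (illustrative values only).
weights = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.9]]
bmu = find_bmu([0.1, 0.8], weights)
```

In this example the third unit (index 2) is selected, since its weight vector lies closest to the input.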
A hand shape instruction recognition and learning system
To realize Human–Machine Interaction (HMI), the behavior of a user, such as body pose, hand gestures, voice, and facial expression, needs to be recognized and understood by the machine. In our previous work, a hand shape instruction recognition and learning system was constructed for partner robots, such as pet robots, housework robots, and entertainment robots, using T-SOM [26], [27] and PL-G-SOM [29]. Here, we use AGSOM and IAGSOM described above to accelerate the convergence of learning
Skin area segmentation
Shape estimation and “Region of Interest (ROI)” segmentation have been studied widely for several decades. Neural networks combined with evolutionary algorithms [34], constrained maximum variance mapping (CMVM) [35], a density-based clustering method [36], and region-level-based methods [37] have been proposed. To extract the hand shape from images captured by the robot's camera, we used an image processing method proposed in our previous work [9], [26], [27], [29]. The method uses the threshold
Conclusion
In this paper, we incorporated the asymmetric neighborhood function proposed by Aoki and Aoyagi into Growing SOM (GSOM). The GSOM variants modified by the asymmetric neighborhood with different learning processes were named “AGSOM” and “IAGSOM”. The new SOM variants showed their superiority over the conventional PL-G-SOM in a hand shape recognition and instruction learning system. Compared with the conventional system, the proposed systems provide better learning convergence and stability.
For a
Acknowledgments
A part of this work was supported by Grant-in-Aid for Scientific Research of JSPS (Nos. 23500181, 25330287, and 26330254), Japan.
References (40)
- et al. A gesture recognition system with retina-V1 model and one-pass dynamic programming. Neurocomputing (2013)
- et al. Human face recognition based on multiple features using neural networks committee. Pattern Recognit. Lett. (2004)
- et al. Locally linear discriminant embedding, an efficient method for face recognition. Pattern Recognit. (2008)
- The self-organizing map. Neurocomputing (1998)
- et al. Theoretical aspects of the SOM algorithm. Neurocomputing (1998)
- Essentials of the self-organizing map. Neural Netw. (2013)
- et al. A novel full structure optimization algorithm for radial basis probabilistic neural networks. Neurocomputing (2006)
- et al. Applications of the growing self-organizing map. Neurocomputing (1998)
- et al. Shape recognition based on neural networks trained by differential evolution algorithm. Neurocomputing (2007)
- et al. Feature extraction using constrained maximum variance mapping. Pattern Recognit. (2008)
- An efficient local Chan-Vese model for image segmentation. Pattern Recognit.
- Visual interpretation of hand gesture for human–computer interaction: a review. IEEE Trans. Pattern Anal. Mach. Intell.
- Parametric hidden Markov models for gesture recognition. IEEE Trans. Pattern Anal. Mach. Intell.
- Parameterless-growing-SOM and its application to a voice instruction learning system. J. Robot.
- Instruction learning systems for partner robots.
- Kernel machine-based one-parameter regularized fisher discriminant method for face recognition. IEEE Trans. Syst., Man. Cybern.
Takashi Kuremoto received the B.E. degree in System Engineering at University of Shanghai for Science and Technology, China in 1986, and M.E. and Ph.D. degrees at Yamaguchi University, Japan in 1996 and 2014, respectively. He worked as a system engineer at Research Institute of Automatic Machine of Beijing from 1986 to 1992 and is currently an assistant professor in Division of Computer Science & Design Engineering, Graduate School of Science and Engineering at Yamaguchi University, Japan. He was an Academic Visitor of School of Computer Science, The University of Manchester, U.K. in 2008. His research interests include bioinformatics, machine learning, complex systems, forecasting and swarm intelligence. He is a member of IEICE, IEEE, IIF and SICE.
Takuhiro Otani received the B.E. and M.E. degrees in information science and engineering from Yamaguchi University, Japan, in 2012 and 2014, respectively. His research interests are neural networks, image processing, and human-machine interaction. He is now with Fujitsu Ten Limited.
Masanao Obayashi received the B.S. degree in electrical engineering from Kyushu Institute of Technology, Japan, in 1975, and the M.S. and Ph.D. degrees in engineering from Kyushu University, Japan, in 1977 and 1997, respectively. From 1977 to 1989, he was with Hitachi Ltd. From 1989 to 1998, he was with Kyushu University, where he was a Research Associate and an Associate Professor. He has been with Yamaguchi University, Japan, since 1998, and since July 2001 he has been a Professor in the Faculty of Engineering, Yamaguchi University. He is now with the Graduate School of Science and Engineering of the same university. His current research interests are intelligent computing and intelligent control. Prof. Obayashi is a member of SICE, IEEJ, JSAI, JSFTII and IEEE.
Kunikazu Kobayashi received B.E., M.E., and Ph.D. degrees in electronic engineering, computer science, and computer engineering from Yamaguchi University, Japan, respectively. From 1994 to 2012, he was an assistant professor at Division of Computer Science & Design Engineering, Yamaguchi University. He was a visiting research professor at Department of Biomedical Engineering, University of Houston, USA in 2010. He was awarded the Highest Honor Student Award from Yamaguchi University and the Best Paper Award at ANNIE in 1994. Currently, he is an associate professor of School of Information Science and Technology, Aichi Prefectural University, Japan. His research interests include neural networks (especially wavelet neural networks), Bayesian method, reinforcement learning, multi-agent system, and biomedical data analysis. He is a member of IEEE, IEEJ, SICE, and IEICE and also a member of research committee on real application oriented machine learning of IEEJ.
Shingo Mabu received the B.E. and M.E. degrees in electrical engineering from Kyushu University, Japan, in 2001 and 2003, respectively, and the Ph.D. degree from Waseda University, Japan, in 2006. From 2006 to 2007, he was a visiting lecturer at Waseda University, and from 2007 to 2012 he was an assistant professor at Waseda University. Since September 2012, he has been an assistant professor in the Graduate School of Science and Engineering, Yamaguchi University. His research interests are evolutionary computation, reinforcement learning and data mining. He is a member of SICE, IEEJ and IEEE.