Elsevier

Neurocomputing

Volume 188, 5 May 2016, Pages 31-41

A hand shape instruction recognition and learning system using growing SOM with asymmetric neighborhood function

https://doi.org/10.1016/j.neucom.2014.10.108

Abstract

In Human–Machine Interaction systems, showing different hand shapes is a convenient way for a user to send instructions to robots, TV sets, and other electronic equipment. In our previous works, we proposed using improved versions of Kohonen’s Self-Organizing Map (SOM), i.e., Transient-SOM (T-SOM) and Parameterless Growing SOM (PL-G-SOM), to recognize different hand shape patterns produced by different bendings of the five fingers of a hand. Recently, Aoki and Aoyagi proposed an asymmetric neighborhood function and introduced it into the conventional SOM to improve learning performance. In this paper, we propose incorporating their asymmetric neighborhood function into Growing SOM (GSOM), an improved SOM that supports additional online learning of input data. Furthermore, the improved GSOM is applied to a hand shape recognition and instruction learning system, and the results of experiments with eight kinds of instructions show the effectiveness of the proposed system.

Introduction

Theories and techniques of Human–Machine Interaction (HMI) are becoming increasingly important because of the rapid development of partner robots such as housework robots, entertainment robots, and pet robots. Users of robots and other intelligent machines are no longer limited to experts and engineers; they also include ordinary people and even children. An intelligent autonomous robot should acquire and analyze visual or auditory information to estimate the intentions or instructions of humans. Approaches such as gesture or posture recognition [1], [2], [3], [4], [5], [6], voice discrimination [7], [8], [9], and face recognition [10], [11], [12] have therefore attracted many researchers.

Generally, for autonomous systems, there are two common issues to deal with:

  • a) What is the input?

  • b) How to get the output, or adapt the input for an unknown input?

The first issue seems to be more important, because the second issue is directly related to it. The input data need to be classified, recognized, and recorded by the system, which is the major task of pattern recognition.

In [2], [3], [7], [8], [9], Kohonen’s Self-Organizing Map (SOM) [13], [14], [15], [16], [17] is used to solve the first issue. SOM is a powerful unsupervised machine learning method for clustering high-dimensional data. As a type of artificial neural network (ANN) that models competitive learning in the primary sensory cortex, SOM has been extensively applied to data mining, pattern recognition, and visualized clustering of complex data sets.

In [4], [5], a hidden Markov model (HMM) is used to recognize hand gestures. An HMM is a stochastic model which supposes that an unknown system is a Markov process with unknown parameters and estimates those parameters from the observable information. Because of its high computational cost, HMM is not widely used in practice.

In [6], Kuremoto et al. used a simplified retina-V1 model and a one-pass dynamic programming method to estimate different gestures of a user’s arm.

Zhao, Huang and Sun [10] proposed a neural networks committee (NNC) machine to tackle face recognition based on multi-features.

In [11], Chen et al. proposed a kernel machine-based one-parameter regularized Fisher discriminant method for face recognition, and showed that it is superior to conventional linear discriminant analysis (LDA) methods.

Li, Zheng and Huang [12] proposed an improved LDA named “locally linear discriminant embedding (LLDE)”.

Additionally, advanced ANNs [18], [19], [20] and support vector machines (SVMs) [21] are also powerful tools for pattern recognition and HMI problems.

In this paper, we concentrate on developing a visual recognition system using an improved SOM. The system for HMI is supposed to classify input images showing different shapes of a user’s hand, and to learn how to output correct actions according to the judgment of the user (instructor).

Theoretically, the original SOM has some defects, such as:

  • a) Exhaustion of units.

  • b) Difficulty of adjustment of parameters.

  • c) Disorder of topology of map.

To overcome these problems, different improved SOM variants have been proposed.

For example, to deal with the exhaustion of units, Growing SOM (GSOM) was proposed in [22], [23], [24], [25], and Transient-SOM (T-SOM) was proposed in our previous works [26], [27].

To keep learning convergence stable and use fewer parameters, Parameterless SOM (PL-SOM) was proposed by Berglund and Sitte in [28], and we applied its idea to GSOM to obtain a Parameterless Growing SOM (PL-G-SOM) [7], [8], [9], [29].

To avoid defects of the map’s topology during the learning process, Aoki and Aoyagi recently proposed an asymmetric neighborhood function and applied it to the original SOM [30], [31], [32]. However, this method has not been used in other SOM variants such as GSOM and PL-G-SOM.

In this paper, we propose introducing Aoki and Aoyagi’s asymmetric neighborhood function [30], [31], [32] into GSOM, yielding two novel variants of SOM: one is named “AGSOM” and the other “IAGSOM”. In AGSOM, the direction of the asymmetric neighborhood is fixed during the learning process. In IAGSOM, opposite directions are used iteratively to resolve the disorder of topology asymptotically. AGSOM and IAGSOM are used in a hand shape instruction recognition and learning system [26], [27], [29] for HMI. Experiments including image processing were performed to validate the proposed system.
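The difference between a fixed and an alternating neighborhood direction can be illustrated with a small Python sketch. It is only a stand-in under stated assumptions: the exact asymmetric neighborhood function is defined in [30], [31], [32] and is not reproduced in this text, so the stretched Gaussian below, the parameters `sigma` and `beta`, and the switching `period` are all hypothetical choices made for illustration on a 1-dimensional map.

```python
import math


def asymmetric_neighborhood(i, c, direction, sigma=1.0, beta=1.5):
    """Illustrative asymmetric neighborhood on a 1-D map (assumed form).

    i, c      : indices of a unit and of the best-match unit (BMU)
    direction : +1 or -1, the side of the BMU on which the neighborhood is wider
    sigma     : base width of the Gaussian (hypothetical parameter)
    beta      : stretch factor >= 1 (hypothetical parameter)
    """
    r = (i - c) * direction  # signed grid distance along `direction`
    # Stretch the Gaussian on the preferred side, compress it on the other.
    scaled = r / beta if r >= 0 else r * beta
    return math.exp(-(scaled ** 2) / (2.0 * sigma ** 2))


def neighborhood_direction(step, period=None):
    """Return the direction used at a given learning step.

    period=None : direction stays +1 for all steps (AGSOM-like, fixed).
    period=k    : direction flips every k steps (IAGSOM-like, alternating;
                  the switching schedule is an assumption of this sketch).
    """
    if period is None:
        return 1
    return 1 if (step // period) % 2 == 0 else -1


# Fixed direction: units to the right of the BMU (index 2) get larger values.
fixed = [round(asymmetric_neighborhood(i, 2, neighborhood_direction(0)), 3)
         for i in range(5)]

# Alternating direction: at step 5 with period 5 the wide side is mirrored.
flipped = [round(asymmetric_neighborhood(i, 2, neighborhood_direction(5, period=5)), 3)
           for i in range(5)]
```

In this sketch the neighborhood value is 1 at the BMU in both cases, and flipping the direction simply mirrors the profile around the BMU.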

In the rest of this paper, we describe the original SOM [13], [14], [15], PL-G-SOM [7], [8], [9], [29], AGSOM (proposed), and IAGSOM (proposed) in Section 2, and apply them to a hand shape instruction recognition and learning system in Section 3. Hand image instruction learning experiments are reported in Section 4, and conclusions are given in Section 5.

Section snippets

Original SOM

Kohonen’s SOM algorithm maps n-dimensional data $x = (x_1, x_2, \ldots, x_n)$ to $m$ units in a low-dimensional space (the so-called “feature map”) with connections $m_i = (m_{i1}, m_{i2}, \ldots, m_{in})$ by a winner-takes-all rule: $$c = \arg\min_{i} \lVert x - m_i \rVert,$$ where $i = 1, 2, \ldots, m$ is the index of the units on the low-dimensional map (a 1- or 2-dimensional grid). $c$ indicates the best-match-unit (BMU) on the map, which has the shortest Euclidean distance to the input data $x$. Initially, connection weight $m_i$ is given by a random value, and following
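As a minimal sketch of the winner-takes-all rule above, the following Python code initializes the connection weights randomly and selects the BMU by the shortest Euclidean distance. The function names (`init_weights`, `best_matching_unit`) and the toy weights are hypothetical and chosen only for illustration; the weight update that follows BMU selection is not covered here.

```python
import math
import random


def euclidean(x, m):
    """Euclidean distance between input x and a unit's weight vector m."""
    return math.sqrt(sum((xj - mj) ** 2 for xj, mj in zip(x, m)))


def init_weights(num_units, dim, seed=0):
    """Give every unit a random n-dimensional weight vector in [0, 1)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(num_units)]


def best_matching_unit(x, weights):
    """Return the index c of the unit whose weights are closest to x."""
    return min(range(len(weights)), key=lambda i: euclidean(x, weights[i]))


# Toy example with three units and 2-dimensional inputs (illustrative values).
toy_weights = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.9]]
c = best_matching_unit([0.1, 0.8], toy_weights)

# Randomly initialized map, as described for the start of learning.
random_weights = init_weights(num_units=4, dim=3)
```

Here `c` is the index of the third unit, whose weights `[0.2, 0.9]` lie closest to the input `[0.1, 0.8]`.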

A hand shape instruction recognition and learning system

To realize Human–Machine Interaction (HMI), the behavior of a user, such as body pose, hand gestures, voice, and facial expressions, needs to be recognized and understood by the machine. In our previous work, a hand shape instruction recognition and learning system was constructed for partner robots, such as pet robots, housework robots, and entertainment robots, using T-SOM [26], [27] and PL-G-SOM [29]. Here, we use AGSOM and IAGSOM described above to accelerate the convergence of learning

Skin area segmentation

Shape estimation and “Region of Interest (ROI)” segmentation have been studied widely for several decades. Neural networks combined with an evolutionary algorithm [34], constrained maximum variance mapping (CMVM) [35], a density-based clustering method [36], and region-level-based methods [37] have been proposed. To extract the hand shape from images captured by the robot’s camera, we used an image processing method proposed in our previous work [9], [26], [27], [29]. The method uses the threshold

Conclusion

In this paper, we incorporated the asymmetric neighborhood function proposed by Aoki and Aoyagi into Growing SOM (GSOM). The GSOM variants modified by the asymmetric neighborhood with different learning processes were named “AGSOM” and “IAGSOM”. The new SOM variants showed their superiority over the conventional PL-G-SOM in a hand shape recognition and instruction learning system. Compared with the conventional system, the proposed systems provide better learning convergence and stability.

For a

Acknowledgments

A part of this work was supported by Grant-in-Aid for Scientific Research of JSPS (Nos. 23500181, 25330287, and 26330254), Japan.

References (40)

  • X.-F. Wang et al., An efficient local Chan-Vese model for image segmentation, Pattern Recognit. (2010)
  • V.I. Pavlovic et al., Visual interpretation of hand gesture for human–computer interaction: a review, IEEE Trans. Pattern Anal. Mach. Intell. (1997)
  • C. Nolker, H. Ritter, Parametrized SOMs for hand posture reconstruction, In: Proceedings of IEEE-INNS-ENNS...
  • G. Heidemann, H. Bekel, L. Bax, A. Saalbach, Hand gesture recognition: selforganising maps as a graphical user...
  • M. Hossain, M. Jenkin, Recognizing hand-raising gesture using HMM, In: Proceedings of 2nd Canadian Conference on...
  • A.D. Wilson et al., Parametric hidden Markov models for gesture recognition, IEEE Trans. Pattern Anal. Mach. Intell. (1999)
  • T. Kuremoto et al., Parameterless-growing-SOM and its application to a voice instruction learning system, J. Robot. (2010)
  • T. Kuremoto, T. Yamane, L.-B. Feng, K. Kobayashi, and M. Obayashi, A human-machine interaction system: a voice command...
  • T. Kuremoto et al., Instruction learning systems for partner robots
  • W.S. Chen et al., Kernel machine-based one-parameter regularized fisher discriminant method for face recognition, IEEE Trans. Syst., Man. Cybern. (2005)

Takashi Kuremoto received the B.E. degree in System Engineering at University of Shanghai for Science and Technology, China in 1986, and M.E. and Ph.D. degrees at Yamaguchi University, Japan in 1996 and 2014, respectively. He worked as a system engineer at Research Institute of Automatic Machine of Beijing from 1986 to 1992 and is currently an assistant professor in Division of Computer Science & Design Engineering, Graduate School of Science and Engineering at Yamaguchi University, Japan. He was an Academic Visitor of School of Computer Science, The University of Manchester, U.K. in 2008. His research interests include bioinformatics, machine learning, complex systems, forecasting and swarm intelligence. He is a member of IEICE, IEEE, IIF and SICE.

Takuhiro Otani received the B.E. and M.E. degrees in information science and engineering from Yamaguchi University, Japan, in 2012 and 2014, respectively. His research interests are neural networks, image processing, and human-machine interaction. He is now with Fujitsu Ten Limited.

Masanao Obayashi received the B.S. degree in electrical engineering from Kyushu Institute of Technology, Japan, in 1975, and the M.S. and Ph.D. degrees in engineering from Kyushu University, Japan, in 1977 and 1997, respectively. From 1977 to 1989, he was with Hitachi Ltd. From 1989 to 1998, he was with Kyushu University, where he was a Research Associate and an Associate Professor. Since 1998 he has been with Yamaguchi University, Japan, where he has been a Professor in the Faculty of Engineering since July 2001. He is now with the Graduate School of Science and Engineering of the same university. His current research interests are intelligent computing and intelligent control. Prof. Obayashi is a member of SICE, IEEJ, JSAI, JSFTII and IEEE.

Kunikazu Kobayashi received B.E., M.E., and Ph.D. degrees in electronic engineering, computer science, and computer engineering from Yamaguchi University, Japan, respectively. From 1994 to 2012, he was an assistant professor at Division of Computer Science & Design Engineering, Yamaguchi University. He was a visiting research professor at Department of Biomedical Engineering, University of Houston, USA in 2010. He was awarded the Highest Honor Student Award from Yamaguchi University and the Best Paper Award at ANNIE in 1994. Currently, he is an associate professor of School of Information Science and Technology, Aichi Prefectural University, Japan. His research interests include neural networks (especially wavelet neural networks), Bayesian method, reinforcement learning, multi-agent system, and biomedical data analysis. He is a member of IEEE, IEEJ, SICE, and IEICE and also a member of research committee on real application oriented machine learning of IEEJ.

Shingo Mabu received the B.E. and M.E. degrees in electrical engineering from Kyushu University, Japan, in 2001 and 2003, respectively, and the Ph.D. degree from Waseda University, Japan, in 2006. From 2006 to 2007, he was a visiting lecturer in Waseda University, and from 2007 to 2012 he was an assistant professor in Waseda University. Since September 2012, he has been an assistant professor in Graduate School of Science and Engineering, Yamaguchi University. His research interests are evolutionary computation, reinforcement learning and data mining. He is a member of SICE, IEEJ and IEEE.