ART2 neural network interacting with environment
Introduction
Although neural networks (NNs) have been studied extensively, NNs based on adaptive resonance theory (ART2) have received comparatively little attention [1], [2]. Research in this area has focused mainly on (i) pattern excursion, typically addressed by the method of fractionizing and fitting [3] or by adding two bounds [4]; (ii) the subjective setting of the vigilance parameter, typically addressed by devising new vigilance test criteria [5], [6]; and (iii) output without hierarchical structure, for which improved and modified ART2 NN learning algorithms have been proposed [7], [8]. Like many other types of NNs, ART2 NNs require a large set of samples for training, and in some cases obtaining such samples is very difficult or even impossible. To overcome this difficulty, this paper proposes using reinforcement learning (RL) to train ART2 networks without samples. In other words, learning is driven by information from the system to which the pattern recognition technique is applied, i.e., by interaction with a dynamic environment.
In general, an ART2 NN [9] performs stable classification through competitive learning and a self-stabilization mechanism. Exploiting the pattern recognition and classification abilities of ART2 NNs has been a popular subject of study in the engineering, artificial intelligence, and machine learning communities, with applications in areas such as fault detection [10], video scene change detection [11], and assessing the viability of recommender systems in a citizen web portal [12].
As one of the most commonly used RL algorithms, temporal difference (TD) learning is used in this paper to train the ART2 NN [13] via interaction with the dynamic environment instead of training samples. The state parameters of the system are input to the ART2 NN, which generates a required pattern that is in turn used by the system. At the RL level, the TD algorithm uses the output of the system to "reward" the ART2 NN. Through this iterative process, a well-trained ART2 NN is finally obtained.
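This training loop can be sketched in outline as follows. The class name, learning rates, and update rules below are illustrative assumptions, not the paper's exact algorithm: a TD(0) error computed from the system's reward scales how strongly the winning category's weights move toward the current state vector.

```python
import numpy as np

def td_error(r, v_s, v_s_next, gamma=0.9):
    """TD(0) error: delta = r + gamma * V(s') - V(s)."""
    return r + gamma * v_s_next - v_s

class RLART2:
    """Hypothetical sketch: an ART2-like layer whose winning category
    is reinforced by a TD error instead of supervised samples."""

    def __init__(self, n_inputs, n_categories, eta=0.1, alpha=0.5):
        rng = np.random.default_rng(0)
        self.w = rng.random((n_categories, n_inputs))  # bottom-up weights
        self.v = np.zeros(n_categories)                # value estimate per category
        self.eta, self.alpha = eta, alpha

    def choose(self, x):
        """Winning category = best match to the normalized state input."""
        x = x / (np.linalg.norm(x) + 1e-9)
        return int(np.argmax(self.w @ x))

    def reinforce(self, x, j, r, j_next):
        """Scale the winner's weight update by the TD error (the 'reward')."""
        delta = td_error(r, self.v[j], self.v[j_next])
        self.v[j] += self.alpha * delta
        self.w[j] += self.eta * delta * (x - self.w[j])
```

A positive TD error pulls the winning prototype toward the state that produced it; a negative error pushes it away, so categories that lead to good system behavior are gradually stabilized.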
The rest of the paper is organized as follows. Section 2 gives a brief introduction to RL. In Section 3, the RL-ART2 model, its architecture, and its learning algorithm are described. Section 4 presents simulation results and error analysis for an example system, i.e., the collaborative movement of mobile robots, and compares them with results obtained using a genetic algorithm for learning. Finally, a brief conclusion is given in Section 5.
Reinforcement learning
RL [13] is one of the important machine learning methods. In a dynamic environment, it can be used to acquire self-adaptive response behavior; it is an asynchronous dynamic-programming method based on the Markov decision process (MDP). Because it can optimize decisions through interaction with an environment, RL has broad application value in solving complex optimization and control problems.
A general RL model is
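The agent–environment interaction underlying such a model can be sketched as follows; the toy one-dimensional environment and policy below are purely illustrative stand-ins, not the paper's robot system.

```python
# Minimal sketch of the general RL interaction loop: at each step the agent
# observes a state, takes an action, and receives a reward plus the next
# state from the environment.

def toy_env(state, action):
    """Hypothetical 1-D environment: reward is +1 for moving toward state 0."""
    next_state = state + (1 if action == "right" else -1)
    reward = 1.0 if abs(next_state) < abs(state) else -1.0
    return next_state, reward

def run_episode(policy, state=5, steps=10):
    """Accumulate reward over a fixed number of agent-environment steps."""
    total = 0.0
    for _ in range(steps):
        action = policy(state)
        state, reward = toy_env(state, action)
        total += reward
    return total

# A policy that always moves toward 0 collects the maximum reward.
greedy = lambda s: "left" if s > 0 else "right"
```

The essential point is that no labeled samples appear anywhere in the loop: the reward signal alone shapes the agent's behavior.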
Reinforcement-learning-based ART2 NN (RL-ART2)
In this section, we briefly review the basic concepts and notations of ART2 NNs, as shown in Fig. 2. Subsequently, the architectures of ART2 nodes incorporating RL and its learning algorithm are outlined.
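The core ART2 mechanism of competitive choice followed by a vigilance test can be sketched as follows. This is a simplified fast-learning variant with assumed parameters, not the exact equations of Fig. 2: the winning category is accepted only if its match with the input exceeds the vigilance parameter rho; otherwise a new category is created.

```python
import numpy as np

def art2_step(x, weights, rho=0.9, beta=0.5):
    """One ART2-style classification step over a list of prototype vectors."""
    x = x / (np.linalg.norm(x) + 1e-9)                # F1-layer normalization
    for j in np.argsort([-w @ x for w in weights]):   # categories by activation
        w = weights[j]
        match = (w @ x) / (np.linalg.norm(w) + 1e-9)
        if match >= rho:                              # vigilance test passed
            weights[j] = (1 - beta) * w + beta * x    # resonance: update prototype
            return int(j), weights
    weights.append(x.copy())                          # no resonance: new category
    return len(weights) - 1, weights
```

Because a failed vigilance test spawns a new category rather than distorting an existing prototype, the network classifies novel inputs without forgetting earlier ones, which is the self-stabilization property the RL layer builds on.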
Simulation experiment and analysis
The parameters and settings of the environment are as follows:
- Simulator: TeamBots [16].
- Environment: as shown in Fig. 5, the objects labeled 0 and 1 are robots R, whose movement velocity is 0.7. The square object is the recycle bin, the long-pole objects are the goal objects, and the other objects are static obstacles.
- For each control mode, about 200 episodes are run to obtain an average value. Each episode has 400 simulation steps, and each simulation step lasts about 0.06 s.
- The maximum deflection
Conclusions
ART2 NNs, like many other types of NNs, require a large set of samples for training, and in some cases such samples are very difficult or even impossible to obtain. To overcome this difficulty, we developed an online reinforcement learning algorithm to train an ART2 NN without samples. In other words, a novel RL-based ART2 NN, RL-ART2, is developed that is capable of online learning through interaction with its environment. The proposed approach was evaluated by
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants 60774059 and 60834002, Shanghai Leading Academic Disciplines under Grant T0103, the Excellent Young Teachers Program of Shanghai Municipal Commission of Education, and Key Project of Science and Technology Commission of Shanghai Municipality under Grants 061107031 and 06ZR14131. Also, the authors would like to thank Dr. T.C. Yang and Professor H.S. Hu for their helpful suggestions in improving the quality
References (17)
- et al., An approach based on the adaptive resonance theory for analyzing the viability of recommender systems in a citizen Web portal, Expert Syst. Appl. (2007).
- G.A. Carpenter et al., Fuzzy ART: fast stable learning and categorization of analog patterns by adaptive resonance system, Neural Networks (1991).
- S. Grossberg, Adaptive pattern classification and universal recoding, I: parallel development and coding of neural feature detectors, Biol. Cybern. (1976).
- S. Grossberg, Adaptive pattern classification and universal recoding, II: feedback, expectation, olfaction, and illusions, Biol. Cybern. (1976).
- et al., The research on a fractionizing and fitting ART2 neural network with supervise, Acta Electron. Sin. (2004).
- et al., ART2 neural network with bidirectional matching mechanism, J. Zhejiang Univ. Eng. Sci. (2004).
- X.D. Qian, Z.O. Wang, Y. Wang, A method of data clustering based on improved algorithm of ART2, in: Proceedings of the...
- et al., ART2 neural networks with more vigorous vigilance test criterion, J. Image Graphics (2001).
Jian Fan received his B.E. degree in artillery command and M.S. degree in computer science from PLA Artillery College in 1999 and 2002, and his Ph.D. degree in Control Theory and Control Engineering from Shanghai University in 2006. His current research interests include intelligent control, robot control, and computational intelligence. He is now a postdoctoral researcher in the Department of Automation, Shanghai University.
Yang Song received his B.E. and Ph.D. degrees from the Department of Automation, Nanjing University of Science and Technology, in 1998 and 2006. His interests are switched systems, hybrid systems, and robust control. He is now a postdoctoral researcher in the Department of Automation, Shanghai University.
MinRui Fei received his B.E. and M.S. degrees in Industrial Automation from Shanghai University of Technology in 1984 and 1992, respectively, and his Ph.D. degree in Control Theory and Control Engineering from Shanghai University in 1997. Since 1998, he has been a professor and doctoral supervisor at Shanghai University. His current research interests are in the areas of intelligent control, complex system modeling, networked control systems, field control systems, etc.