Incremental class learning approach and its application to handwritten digit recognition
Introduction
The catastrophic interference problem [1], [2], [3] remains a significant impediment in building large, scalable learning systems based on neural networks. In its simplest form, the problem may be stated as follows: when a network trained to solve task A is subsequently trained to solve task B, it “forgets” the solution to task A. In other words, the network is unable to acquire new knowledge without destroying previously acquired knowledge structures. A seemingly simple solution to this problem is to retrain the network on a cumulative training set containing examples from all previously learned categories. However, for large-scale problems this approach is not practical.
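The effect is easy to reproduce in a toy setting. The following sketch (our illustration, not from the paper) trains a single linear unit by gradient descent on task A and then, sequentially, on a conflicting task B; the task-A loss afterwards shows the forgetting:

```python
# Illustrative sketch (not from the paper): a single linear unit y = w*x is
# trained on task A (y = 2x), then sequentially on a conflicting task B
# (y = -x). After training on B, the loss on task A is large again: the
# unit has "forgotten" task A.

def train(w, xs, ys, lr=0.1, epochs=200):
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w -= lr * (w * x - y) * x      # SGD step on 0.5*(w*x - y)^2
    return w

def mse(w, xs, ys):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0]
ys_a = [2.0 * x for x in xs]               # task A: y = 2x
ys_b = [-1.0 * x for x in xs]              # task B: y = -x, conflicts with A

w = train(0.0, xs, ys_a)
loss_a_before = mse(w, xs, ys_a)           # near zero: task A solved
w = train(w, xs, ys_b)                     # now learn task B sequentially
loss_a_after = mse(w, xs, ys_a)            # large: task A forgotten
```

Retraining on a cumulative set containing both tasks would avoid the problem, but, as noted above, this does not scale.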
Incremental training methods are especially important for fast or real-time control problems (e.g. ATM traffic control). Because of the large amount of noisy on-line data, multilayer-perceptron-type networks with their relatively slow retraining schemes are not suitable for such tasks. While there exist neural network models with learning schemes not prone to catastrophic interference (e.g. ART-type networks [4]), their effectiveness in large-scale and noisy problem domains is still being investigated. Some promising results have been obtained for hybrid neuro-fuzzy ART models, e.g. Fuzzy ARTMAP, FasArt or PROBART.
The incremental class learning (ICL) approach [5] attempts to address the catastrophic interference problem and at the same time offers a learning framework that promotes the sharing of previously learned knowledge structures. With respect to object recognition and classification problems, the approach may be summarized as follows: The system starts off with all the nodes and links it will ever have, but initially, it focuses on only a small number of categories. After it learns to recognize these categories, it tries to identify which of the features formed in the “hidden layers” play a critical role in the recognition of these categories. The system “freezes” these critical features by fixing their input weights. As a result, they cannot be obliterated by subsequent learning. These frozen features, however, can participate in structures that are learned subsequently to recognize other categories. As the system learns to recognize more and more categories, it is hoped that the set of features will gradually stabilize and eventually, learning a new category will primarily consist of combining existing features in novel ways.
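The freezing mechanism can be sketched as gradient masking (our illustration; the criterion for selecting which features are critical is assumed here, not the paper's): frozen features still contribute to the forward pass, but the gradient on their input weights is zeroed, so later training cannot overwrite them.

```python
import numpy as np

# Sketch of weight freezing (illustrative): each row of W holds the input
# weights of one hidden feature. The first two features are "frozen" by
# masking their gradient; they keep computing outputs but can no longer be
# changed by subsequent learning.

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))                     # 4 hidden features, 3 inputs
frozen = np.array([True, True, False, False])   # features 0 and 1 are frozen

def sgd_step(W, grad, lr=0.1):
    grad = grad.copy()
    grad[frozen] = 0.0                          # freezing = zeroed gradient
    return W - lr * grad

W0 = W.copy()
W = sgd_step(W, rng.normal(size=W.shape))       # one later-training step
# Rows 0-1 (frozen) are identical to W0; rows 2-3 have moved.
```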
The paper is organized as follows. In Section 2, the proposed ICL approach is described and its relation to some of the existing incremental learning methods is discussed. Section 3 presents computer simulation results of the ICL for the handwritten digit recognition (HDR) problem. Conclusions and directions for future work appear in Section 4.
Incremental class learning
The ICL approach is a supervised learning procedure for neural networks that can be described as follows:
- Subproblems are learned incrementally.
- Structures playing a critical role in solving a subproblem are frozen.
- The above structures are available for subsequent learning.
- Solutions to subproblems are combined in an appropriate manner to solve the complete problem.
The success of the approach depends on four key factors. First, it should be possible to decompose the problem into subproblems in an
ICL application to HDR problem
In this section we present the application of the proposed ICL method to the HDR problem. The main objective of the work reported here is to show the efficacy of the proposed learning scheme in the context of a non-trivial and real-world problem domain involving noisy data. While we strive for a solid recognition performance, it is not our objective to develop a state-of-the-art HDR system. Such systems achieve higher recognition rates by incorporating sophisticated and specialized post- and
Conclusions
In this work we have proposed the ICL approach based on freezing relevant features and sharing common (similar) features among multiple classes. The ICL approach not only takes advantage of existing knowledge when learning a new problem, it also offers immunity from the catastrophic interference problem. Promising results obtained for the unconstrained HDR problem suggest that the approach may be a suitable framework for building large, scalable learning systems. We conjecture that the sharing
References (21)
- et al., Catastrophic interference in connectionist networks: the sequential learning problem
- et al., A massively parallel architecture for a self-organizing neural pattern recognition machine, Computer Vision, Graphics and Image Processing (1987)
- et al., Feature discovery by competitive learning, Cognitive Science (1985)
- J.L. McClelland, B.L. McNaughton, R.C. O'Reilly, Why there are complementary learning systems in the hippocampus and...
- et al., Catastrophic interference in learning process by neural networks
- L. Shastri, Attribution learning as a solution to the catastrophic interference problem in learning with neural nets,...
- et al., The cascade-correlation learning architecture
- The upstart algorithm: a method for constructing and training feedforward neural networks, Neural Computation (1990)
- Consonant recognition by modular construction of large phonemic time-delay neural networks
- et al., Recognizing handwritten digit strings using modular spatio-temporal connectionist networks, Connection Science (1995)
1. This research was performed while the author was visiting the International Computer Science Institute and the EECS Department, University of California at Berkeley, Berkeley, CA, USA, with the support of Fulbright Senior Scholarship No. 20895.