Abstract
Continual learning is the constant development of increasingly complex behaviors: the process of building more complicated skills on top of those already developed. A continual-learning agent should therefore learn incrementally and hierarchically. This paper describes CHILD, an agent capable of Continual, Hierarchical, Incremental Learning and Development. CHILD can quickly solve complicated non-Markovian reinforcement-learning tasks and can then transfer its skills to similar but even more complicated tasks, learning these faster still.
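The abstract names two capabilities without showing the mechanism: coping with non-Markovian tasks and transferring skills to harder ones. The full paper builds CHILD from Q-learning combined with Ring's constructive Temporal Transition Hierarchies; the sketch below is not that algorithm, only a minimal illustration of the two claimed capabilities, using tabular Q-learning keyed on a short observation history, with the learned table passed forward to a harder task. The `env` interface (`reset()` and `step()` returning observation, reward, done), the window size, and all hyperparameters are assumptions made for illustration.

```python
# Illustrative sketch only (not the paper's algorithm): tabular Q-learning
# over a short observation history, with value transfer between tasks.
import random
from collections import defaultdict, deque

def q_learn(env, n_actions, episodes=500, history=2,
            alpha=0.1, gamma=0.9, eps=0.1, Q=None):
    """Learn Q-values keyed on the last `history` observations.

    Keying on a window of past observations is one simple way to cope
    with non-Markovian (partially observable) tasks. Passing in a `Q`
    table learned on an earlier, related task transfers its skills, so
    the harder task starts from informed values rather than zeros.
    """
    Q = Q if Q is not None else defaultdict(lambda: [0.0] * n_actions)
    for _ in range(episodes):
        obs_window = deque(maxlen=history)
        obs_window.append(env.reset())
        state = tuple(obs_window)
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[state][i])
            obs, r, done = env.step(a)      # assumed env interface
            obs_window.append(obs)
            next_state = tuple(obs_window)
            target = r + (0.0 if done else gamma * max(Q[next_state]))
            Q[state][a] += alpha * (target - Q[state][a])
            state = next_state
    return Q
```

Under these assumptions, transfer is simply reuse of the table: `Q = q_learn(easy_env, 4)` followed by `q_learn(hard_env, 4, Q=Q)` starts the harder task from informed values, mirroring the abstract's claim that related, more complicated tasks are learned faster still.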
Cite this article
Ring, M.B. CHILD: A First Step Towards Continual Learning. Machine Learning 28, 77–104 (1997). https://doi.org/10.1023/A:1007331723572