Abstract
To acquire expert skills in a sequential decision-making domain that is too vast to be explored thoroughly, an intelligent agent must be able to induce crucial knowledge from its most representative parts. One way to shape the learning process and guide the learner in the right direction is to select effectively those parts that provide the best training experience. To realize this concept, we propose a shaping method that orchestrates training by iteratively exposing the learner to subproblems generated autonomously from the original problem. The main novelty of the proposed approach lies in equating the learning process with a search in subproblem space and in employing a coevolutionary algorithm to perform this search. Each individual in the population encodes a sequence of subproblems and is evaluated by confronting the learner trained on that sequence with other learners shaped in the same way by other individuals. When applied to the game of Othello, temporal difference learning on the best subproblem sequence found yields substantially better players than learning on the entire problem at once.
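The evaluation loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `train` is a toy stand-in for temporal difference learning, `confront` is a toy stand-in for a game between two learners, and all names, constants, and the truncation-plus-mutation scheme are assumptions chosen for illustration only.

```python
import random

# All constants below are hypothetical, chosen only to keep the sketch small.
NUM_SUBPROBLEMS = 8   # size of the toy subproblem space
SEQ_LEN = 4           # number of subproblems encoded by one individual
POP_SIZE = 6          # number of individuals in the population


def train(sequence):
    """Toy stand-in for TD learning: build a 'skill' vector from a subproblem sequence."""
    skill = [0.0] * NUM_SUBPROBLEMS
    for sp in sequence:
        # Repeating the same subproblem gives diminishing returns.
        skill[sp] += 1.0 / (1.0 + skill[sp])
    return skill


def confront(a, b):
    """Toy stand-in for a game: 1.0 if learner a wins, 0.5 for a draw, 0.0 for a loss."""
    sa, sb = sum(a), sum(b)
    if sa > sb:
        return 1.0
    return 0.5 if sa == sb else 0.0


def evaluate(population):
    """Fitness of an individual = total score of its learner against all other learners."""
    learners = [train(ind) for ind in population]
    return [
        sum(confront(a, b) for j, b in enumerate(learners) if j != i)
        for i, a in enumerate(learners)
    ]


def mutate(sequence, rng):
    """Replace one randomly chosen subproblem in the sequence."""
    child = list(sequence)
    child[rng.randrange(len(child))] = rng.randrange(NUM_SUBPROBLEMS)
    return child


def coevolve(generations=20, seed=0):
    """Search the space of subproblem sequences; return the best individual and its fitness."""
    rng = random.Random(seed)
    population = [
        [rng.randrange(NUM_SUBPROBLEMS) for _ in range(SEQ_LEN)]
        for _ in range(POP_SIZE)
    ]
    for _ in range(generations):
        fitness = evaluate(population)
        ranked = [ind for _, ind in sorted(
            zip(fitness, population), key=lambda pair: pair[0], reverse=True)]
        parents = ranked[:POP_SIZE // 2]            # keep the better half
        population = parents + [mutate(p, rng) for p in parents]
    fitness = evaluate(population)
    best = max(range(POP_SIZE), key=lambda i: fitness[i])
    return population[best], fitness[best]
```

Calling `coevolve()` returns one subproblem sequence together with its score against the rest of the final population. In the setting of the paper, `train` would run temporal difference learning on Othello subproblems and `confront` would play actual games; the sketch only shows how fitness arises from interactions between learners rather than from a fixed objective.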
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Szubert, M., Krawiec, K. (2012). Autonomous Shaping via Coevolutionary Selection of Training Experience. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds) Parallel Problem Solving from Nature - PPSN XII. PPSN 2012. Lecture Notes in Computer Science, vol 7492. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32964-7_22
DOI: https://doi.org/10.1007/978-3-642-32964-7_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32963-0
Online ISBN: 978-3-642-32964-7
eBook Packages: Computer Science, Computer Science (R0)