Abstract
Automatic learning of control policies is becoming increasingly important as autonomous agents are deployed alongside, or in place of, humans in dangerous and fast-paced situations. Reinforcement learning (RL), including genetic policy search, is a promising technology for learning such control policies. Unfortunately, RL techniques can take prohibitively long to learn a sufficiently good control policy in environments described by many sensors (features). We argue that in many cases only a subset of the available features is needed to learn the task at hand, since the others may carry irrelevant or redundant information. In this work, we propose a predictive feature selection framework that analyzes data obtained during the execution of a genetic policy search algorithm to identify relevant features on-line. Embedding feature selection into the process of learning a control policy in this way constrains the policy search space and reduces the time needed to locate a sufficiently good policy. We explore the framework through an instantiation called predictive feature selection embedded in NeuroEvolution of Augmenting Topologies (NEAT), or PFS-NEAT. In an empirical study, we demonstrate that PFS-NEAT enables NEAT to find good control policies in two benchmark environments, and show that it can outperform three competing feature selection algorithms, FS-NEAT, FD-NEAT, and SAFS-NEAT, in several variants of these environments.
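To make the embedded selection loop concrete, the sketch below shows one way on-line feature selection could be interleaved with a generational policy search. It is only an illustration of the general idea under simplifying assumptions, not the paper's implementation: candidate policies are plain linear maps rather than evolved NEAT networks, feature relevance is scored by a simple correlation proxy, and names such as `evaluate`, `fs_every`, and `k` are hypothetical.

```python
# Minimal, hypothetical sketch of feature selection embedded in a generational
# policy search loop, in the spirit of the framework described above. Everything
# here is an illustrative assumption: policies are linear maps rather than evolved
# NEAT networks, and relevance is scored by the absolute correlation between each
# feature and the episode return.
import numpy as np


def score_features(states, returns):
    """Score each feature by how strongly it co-varies with observed returns."""
    scores = np.zeros(states.shape[1])
    for j in range(states.shape[1]):
        col = states[:, j]
        if col.std() > 0 and returns.std() > 0:
            scores[j] = abs(np.corrcoef(col, returns)[0, 1])
    return scores


def pfs_policy_search(evaluate, n_features, generations=30, pop_size=20,
                      k=4, fs_every=10, seed=0):
    """evaluate(policy) must run one episode with the given policy and return
    (visited states as an array of shape [T, n_features], scalar return)."""
    rng = np.random.default_rng(seed)
    selected = np.arange(n_features)          # start with every feature enabled
    population = [rng.normal(size=n_features) for _ in range(pop_size)]
    seen_states, seen_returns = [], []
    best_w, best_fit = population[0], -np.inf

    for gen in range(generations):
        fitness = []
        for w in population:
            # Mask out de-selected features so the policy search space stays small.
            policy = lambda s, w=w, m=selected: float(s[m] @ w[m])
            states, ret = evaluate(policy)
            seen_states.extend(states)
            seen_returns.extend([ret] * len(states))
            fitness.append(ret)
            if ret > best_fit:
                best_w, best_fit = w, ret
        # Periodically re-estimate feature relevance from the experience the search
        # has already generated -- no extra environment interaction is needed.
        if (gen + 1) % fs_every == 0:
            scores = score_features(np.asarray(seen_states), np.asarray(seen_returns))
            selected = np.argsort(scores)[::-1][:k]
        # Crude (1, lambda)-style reproduction: mutate the generation's best individual.
        elite = population[int(np.argmax(fitness))]
        population = [elite + 0.1 * rng.normal(size=n_features) for _ in range(pop_size)]

    return selected, best_w
```

The point the sketch is meant to highlight is that selection consumes only the trajectories already produced while evaluating candidate policies, so narrowing the feature set costs no additional environment interaction.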
Notes
Project located at: http://rars.sourceforge.net.
References
Argall, B. D., Chernova, S., Veloso, M., & Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57, 469–483.
Bellman, R. (2003). Dynamic programming. Mineola: Dover Publications.
Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. In Proceedings of the 26th annual international conference on machine learning (pp. 41–48). New York: ACM.
Böhm, N., Kókai, G., & Mandl, S. (2004). Evolving a heuristic function for the game of Tetris. In Lernen, Wissensentdeckung und Adaptivität (LWA) (pp. 118–122). Berlin: Humboldt-Universität.
Boutilier, C., Dean, T., & Hanks, S. (1999). Decision-theoretic planning: Structural assumptions and computational leverage. JAIR, 11, 1–94.
Cannady, J. (2000). Next generation intrusion detection: Autonomous reinforcement learning of network attacks. In Proceedings of the 23rd National Information Systems Security Conference (pp. 1–12).
Castelletti, A., Galelli, S., Restelli, M., & Soncini-Sessa, R. (2011). Tree-based variable selection for dimensionality reduction of large-scale control systems. In IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), IEEE (pp. 62–69).
Cliff, D., & Miller, G. (1995). Tracking the red queen: Measurements of adaptive progress in co-evolutionary simulations. In F. Morán, A. Moreno, J. Merelo, & P. Chacón (Eds.), Advances in artificial life, lecture notes in computer science (Vol. 929, pp. 200–218). Berlin Heidelberg: Springer. doi: 10.1007/3-540-59496-5_300
Deisenroth, M., & Rasmussen, C. (2011). PILCO: A model-based and data-efficient approach to policy search. In L. Getoor & T. Scheffer (Eds.), Proceedings of the 28th International Conference on Machine Learning (ICML-11) (pp. 465–472). New York: ACM.
Devijver, P., & Kittler, J. (1982). Pattern recognition: A statistical approach. London: Prentice Hall International.
Dietterich, T. G. (1998). The MAXQ method for hierarchical reinforcement learning. In Proceedings of the Fifteenth International Conference on Machine Learning, Morgan Kaufmann (pp. 118–126).
Diuk, C., Li, L., & Leffler, B. (2009). The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning. In L. Bottou & M. Littman (Eds.), Proceedings of the 26th International Conference on Machine Learning (pp. 249–256). Montreal: Omnipress.
Doroodgar, B., & Nejat, G. (2010). A hierarchical reinforcement learning based control architecture for semi-autonomous rescue robots in cluttered environments. In 2010 IEEE Conference on Automation Science and Engineering (CASE) (pp. 948–953).
Ernst, D., Geurts, P., & Wehenkel, L. (2005). Tree-based batch mode reinforcement learning. JMLR, 6, 503–556.
Goldberg, D. E., & Richardson, J. (1987). Genetic algorithms with sharing for multimodal function optimization. Proceedings of the Second International Conference on Genetic Algorithms and Their Application (pp. 41–49). Hillsdale, NJ: L. Erlbaum Associates Inc.
Gomez, F., & Miikkulainen, R. (1997). Incremental evolution of complex general behavior. Adaptive Behavior, 5(3–4), 317–342.
Gomez, F. J., & Miikkulainen, R. (1999). Solving non-Markovian control tasks with neuroevolution. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, Morgan Kaufmann (pp. 1356–1361).
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46, 389–422.
Hachiya, H., & Sugiyama, M. (2010). Feature selection for reinforcement learning: Evaluating implicit state-reward dependency via conditional mutual information. In Proceedings of the ECML (pp. 474–489).
Hall, M. (1999). Correlation based feature selection for machine learning. PhD thesis, University of Waikato, Department of Computer Science.
Jolliffe, I. T. (2010). Principal component analysis (2nd ed.). New York: Springer.
Jung, T., & Stone, P. (2009). Feature selection for value function approximation using Bayesian model selection. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 660–675).
Knowles, J. D., Watson, R. A., & Corne, D. W. (2001). Reducing local optima in single-objective problems by multi-objectivization. In E. Zitzler, L. Thiele, K. Deb, C. Coello Coello, & D. Corne (Eds.), Evolutionary multi-criterion optimization, lecture notes in computer science (Vol. 1993, pp. 269–283). Berlin Heidelberg: Springer. doi: 10.1007/3-540-44719-9_19
Kolter, J. Z., & Ng, A. Y. (2009). Regularization and feature selection in least-squares temporal difference learning. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 521–528).
Konidaris, G., & Barto, A. (2009). Efficient skill learning using abstraction selection. Proceedings of the 21st International Joint Conference on Artificial Intelligence (pp. 1107–1112). San Francisco, CA: Morgan Kaufmann Publishers Inc.
Konidaris, G., Kuindersma, S., Barto, A., & Grupen, R. (2010). Constructing skill trees for reinforcement learning agents from demonstration trajectories. NIPS, 23, 1162–1170.
Kveton, B., Hauskrecht, M., & Guestrin, C. (2006). Solving factored MDPs with hybrid state and action variables. Journal of Artificial Intelligence Research, 27, 153–201.
Lazaric, A., Restelli, M., & Bonarini, A. (2007). Reinforcement learning in continuous action spaces through sequential monte carlo methods. Advances in Neural Information Processing Systems (pp. 833–840). Cambridge: MIT Press.
Lehman, J., & Stanley, K. O. (2011). Abandoning objectives: Evolution through the search for novelty alone. Evolutionary Computation, 19(2), 189–223.
Li, L., Walsh, T. J., & Littman, M. L. (2006). Towards a unified theory of state abstraction for MDPs. In Proceedings of the Ninth International Symposium on Artificial Intelligence and Mathematics (pp. 531–539).
Liu, H., & Yu, L. (2005). Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering, 17(4), 491–502.
Loscalzo, S., Wright, R., Acunto, K., & Yu, L. (2012). Sample aware embedded feature selection for reinforcement learning. In Proceedings of GECCO (pp. 879–886).
Mahadevan, S. (2005). Representation policy iteration. Proceedings of the Twenty-First Annual Conference on Uncertainty in Artificial Intelligence (UAI-05) (pp. 372–379). Arlington, Virginia: AUAI Press.
March, J. G. (1991). Exploration and exploitation in organizational learning. Organization Science, 2(1), 71–87.
Melo, F. S., & Lopes, M. (2008). Fitted natural actor-critic: A new algorithm for continuous state-action MDPs. In ECML/PKDD(2) (pp. 66–81).
Mouret, J. B., & Doncieux, S. (2012). Encouraging behavioral diversity in evolutionary robotics: An empirical study. Evolutionary Computation, 20(1), 91–133. doi:10.1162/EVCO_a_00048.
Nouri, A., & Littman, M. (2010). Dimension reduction and its application to model-based exploration in continuous spaces. Machine Learning, 81, 85–98.
Parr, R., Painter-Wakefield, C., Li, L., & Littman, M.L. (2007). Analyzing feature generation for value-function approximation. In ICML (pp. 737–744).
Pazis, J., & Lagoudakis, M. G. (2009). Binary action search for learning continuous-action control policies. In Proceedings of the 26th Annual International Conference on Machine Learning ICML ’09 (pp. 793–800). New York: ACM.
Petrik, M., Taylor, G., Parr, R., & Zilberstein, S. (2010). Feature selection using regularization in approximate linear programs for Markov decision processes. In Proceedings of the 27th International Conference on Machine Learning (pp. 871–878).
Powell, W. B. (2011). Approximate dynamic programming: Solving the curses of dimensionality (2nd ed.). Hoboken, NJ: Wiley.
Puterman, M. L. (1994). Markov decision processes: Discrete stochastic dynamic programming. New York: Wiley-Interscience.
Servin, A., & Kudenko, D. (2008). Multi-agent reinforcement learning for intrusion detection: A case study and evaluation. In Proceedings of the European Conference on Artificial Intelligence (pp. 873–874).
Sher, G. I. (2012). Handbook of neuroevolution through Erlang. New York: Springer.
Stanley, K. O., & Miikkulainen, R. (2002). Efficient reinforcement learning through evolving neural network topologies. In Proceedings of GECCO (pp. 569–577).
Sutton, R., & Barto, A. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Tan, M., Hartley, M., Bister, M., & Deklerck, R. (2009). Automated feature selection in neuroevolution. Evolutionary Intelligence, 1(4), 271–292.
Tan, M., Deklerck, R., Jansen, B., & Cornelis, J. (2012). Analysis of a feature-deselective neuroevolution classifier (FD-NEAT) in a computer-aided lung nodule detection system for CT images. In T. Soule & J. H. Moore (Eds.), GECCO (Companion) (pp. 539–546). New York: ACM.
Taylor, M. E., & Stone, P. (2009). Transfer learning for reinforcement learning domains: A survey. JMLR, 10, 1633–1685.
Tesauro, G., Das, R., Chan, H., Kephart, J. O., Levine, D., Rawson, F. L., III, & Lefurgy, C. (2007). Managing power consumption and performance of computing systems using reinforcement learning. In NIPS.
Vigorito, C. M., & Barto, A. G. (2009). Incremental structure learning in factored MDPs with continuous states and actions. Technical report, University of Massachusetts Amherst, Department of Computer Science.
Watkins, C. J. C. H., & Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8, 279–292.
Whiteson, S., & Stone, P. (2006). Evolutionary function approximation for reinforcement learning. Journal of Machine Learning Research, 7, 877–917.
Whiteson, S., Stone, P., & Stanley, K. O. (2005). Automatic feature selection in neuroevolution. In Proceedings of GECCO (pp. 1225–1232).
Wright, R., Loscalzo, S., & Yu, L. (2011). Embedded incremental feature selection for reinforcement learning. In ICAART 2011 - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence, Artificial Intelligence, Rome, Italy, January 28–30 (Vol. 1, pp. 263–268).
Xu, L., Yan, P., & Chang, T. (1988). Best first strategy for feature selection. In Proceedings of the Ninth International Conference on Pattern Recognition (pp. 706–708).
Acknowledgments
This work was performed under 13-RI-CRADA-13, and was supported in part through computational resources provided by the U.S. DoD HPCMP AFRL/RI Affiliated Resource Center. The authors would like to thank the anonymous reviewers for their helpful comments and suggestions, and Kevin Acunto for his work porting the RARS environment to Java.
Cite this article
Loscalzo, S., Wright, R. & Yu, L. Predictive feature selection for genetic policy search. Auton Agent Multi-Agent Syst 29, 754–786 (2015). https://doi.org/10.1007/s10458-014-9268-y