Feature-Discovering Approximate Value Iteration Methods

Wu, Jia-Hong; Givan, Robert

doi:10.1007/11527862_25


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3607)

Included in the following conference series:

International Symposium on Abstraction, Reformulation, and Approximation


Abstract

Sets of features in Markov decision processes can play a critical role in approximately representing value and in abstracting the state space. Selection of features is crucial to the success of a system and is most often conducted by a human. We study the problem of automatically selecting problem features, and propose and evaluate a simple approach reducing the problem of selecting a new feature to standard classification learning. We learn a classifier that predicts the sign of the Bellman error over a training set of states. By iteratively adding new classifiers as features with this method, training between iterations with approximate value iteration, we find a Tetris feature set that outperforms randomly constructed features significantly, and obtains a score of about three-tenths of the highest score obtained by using a carefully hand-constructed feature set. We also show that features learned with this method outperform those learned with the previous method of Patrascu et al. [4] on the same SysAdmin domain used for evaluation there.
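The feature-construction step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the six-state chain problem, the names `step`, `bellman_error`, and `learn_stump`, and the use of a one-threshold decision stump as the classifier. The sketch shows only one round of labeling states by the sign of the Bellman error and turning the learned classifier into a new feature; the weight training by approximate value iteration between rounds is omitted.

```python
# Hypothetical toy problem (not from the paper): a deterministic 6-state chain
# in which state 5 is an absorbing goal and entering it yields reward 1.
GAMMA = 0.9
STATES = list(range(6))
ACTIONS = (-1, +1)

def step(s, a):
    """Return (next_state, reward) for the toy chain."""
    if s == 5:
        return s, 0.0  # absorbing goal, no further reward
    s2 = min(max(s + a, 0), 5)
    return s2, (1.0 if s2 == 5 else 0.0)

def value(s, weights, features):
    """Linear value estimate: weighted sum of feature values at state s."""
    return sum(w * f(s) for w, f in zip(weights, features))

def bellman_error(s, weights, features):
    """One-step Bellman backup of the current estimate minus the estimate."""
    backup = max(r + GAMMA * value(s2, weights, features)
                 for s2, r in (step(s, a) for a in ACTIONS))
    return backup - value(s, weights, features)

def learn_stump(examples):
    """Stand-in classifier: choose the threshold t that minimizes training
    errors of the rule 'predict +1 iff s >= t', and return it as a feature."""
    best_t, best_err = None, None
    for t in range(len(STATES) + 1):
        err = sum(1 for s, y in examples if (1 if s >= t else -1) != y)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return lambda s: 1.0 if s >= best_t else -1.0

# Start from a single constant feature with zero weight.
features = [lambda s: 1.0]
weights = [0.0]

# Label each training state by the sign of its Bellman error.
# Treating a zero error as the negative class is an arbitrary choice here.
examples = [(s, 1 if bellman_error(s, weights, features) > 0 else -1)
            for s in STATES]

# The learned classifier becomes a new feature with an initial weight of zero.
new_feature = learn_stump(examples)
features.append(new_feature)
weights.append(0.0)
```

In a fuller loop, the weights would be retrained after each new feature is added, and the labeling step would be repeated with the updated value estimate. A stronger classifier could replace the stump without changing the overall structure.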


References

1. Bellman, R., Kalaba, R., Kotkin, B.: Polynomial approximation – a new computational technique in dynamic programming. Math. Comp. 17(8), 155–161 (1963)
2. Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific, Belmont (1996)
3. Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
4. Patrascu, R., Poupart, P., Schuurmans, D., Boutilier, C., Guestrin, C.: Greedy linear value-approximation for factored Markov decision processes. In: AAAI (2002)
5. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
6. Sutton, R.S.: Learning to predict by the methods of temporal differences. MLJ 3, 9–44 (1988)
7. Sutton, R.S., Barto, A.G.: Reinforcement Learning. MIT Press, Cambridge (1998)
8. Tesauro, G.: Temporal difference learning and TD-Gammon. Comm. ACM 38(3), 58–68 (1995)
9. Utgoff, P.E., Precup, D.: Constructive function approximation. In: Motoda, H., Liu, H. (eds.) Feature extraction, construction, and selection: A data-mining perspective, pp. 219–235. Kluwer, Dordrecht (1998)
10. Widrow, B., Hoff Jr., M.E.: Adaptive switching circuits. IRE WESCON Convention Record, 96–104 (1960)
11. Williams, R.J., Baird, L.C.: Tight performance bounds on greedy policies based on imperfect value functions. Technical report, Northeastern University (1993)


Author information

Authors and Affiliations

Electrical and Computer Engineering, Purdue University, W. Lafayette, IN, 47907, USA
Jia-Hong Wu & Robert Givan


Editor information

Editors and Affiliations

UR 079 GEODES, IRD, 32 avenue Henri Varagnat, 93143, Bondy, France
Jean-Daniel Zucker
Dip. di Informatica, Università del Piemonte Orientale, Via Bellini 25/G, 15100, Alessandria, Italy
Lorenza Saitta


About this paper

Cite this paper

Wu, JH., Givan, R. (2005). Feature-Discovering Approximate Value Iteration Methods. In: Zucker, JD., Saitta, L. (eds) Abstraction, Reformulation and Approximation. SARA 2005. Lecture Notes in Computer Science, vol 3607. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527862_25


DOI: https://doi.org/10.1007/11527862_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27872-6
Online ISBN: 978-3-540-31882-8
eBook Packages: Computer Science, Computer Science (R0)
