DOI: 10.1145/3501710.3524734
Research article | Open access

Poster Abstract: Model-Free Reinforcement Learning for Symbolic Automata-encoded Objectives

Published: 04 May 2022

Abstract

In this work, we propose symbolic automata as formal specifications for reinforcement learning agents. Symbolic automata generalize both bounded-time temporal-logic specifications and deterministic finite automata, allowing us to describe input alphabets over metric spaces. They also let us define non-sparse, potential-based rewards that empirically shape the reward surface, leading to better convergence during RL. We further show that this potential-based rewarding strategy still yields the policy that maximizes satisfaction of the given specification.
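The potential-based rewarding strategy mentioned above can be illustrated with the standard Ng-style shaping term F(q, q') = γ·Φ(q') − Φ(q), applied here to the state of a symbolic automaton whose transition guards are predicates over a metric (real-valued) alphabet. The toy automaton, potential values, and all names below are our own illustrative assumptions, not the paper's implementation:

```python
GAMMA = 0.99

# A toy symbolic automaton over a 1-D metric alphabet encoding the task
# "first drive the signal x above 5, then bring it down to 0 or below".
# Each transition is guarded by a predicate over real-valued observations.
TRANSITIONS = {
    0: [(lambda x: x >= 5.0, 1)],   # q0 -> q1 once x exceeds 5
    1: [(lambda x: x <= 0.0, 2)],   # q1 -> q2 (accepting) once x reaches 0
}
ACCEPTING = {2}

def step_automaton(q, x):
    """Advance the automaton on observation x; stay put if no guard fires."""
    for guard, q_next in TRANSITIONS.get(q, []):
        if guard(x):
            return q_next
    return q

# Potential of each automaton state: negated number of transitions still
# needed to reach acceptance. This gives a dense signal instead of the
# sparse "reward only on acceptance" objective.
POTENTIAL = {0: -2.0, 1: -1.0, 2: 0.0}

def shaped_reward(q, q_next, base_reward):
    """Potential-based shaping: r + gamma * Phi(q') - Phi(q)."""
    return base_reward + GAMMA * POTENTIAL[q_next] - POTENTIAL[q]
```

Because the shaping term telescopes along any trajectory, adding it to the base reward densifies the learning signal without changing which policies are optimal, which is why the satisfaction-maximizing policy is preserved.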


Cited By

  • (2024) Optimal Runtime Assurance via Reinforcement Learning. In 2024 ACM/IEEE 15th International Conference on Cyber-Physical Systems (ICCPS), 67–76. https://doi.org/10.1109/ICCPS61052.2024.00013. Online publication date: 13 May 2024.


Published In

HSCC '22: Proceedings of the 25th ACM International Conference on Hybrid Systems: Computation and Control
May 2022, 265 pages
ISBN: 9781450391962
DOI: 10.1145/3501710
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited


Acceptance Rates

Overall acceptance rate: 153 of 373 submissions, 41%
