Abstract
As reinforcement learning moves toward real-world deployment, interpretability becomes essential for ensuring the safety and reliability of intelligent agents. This paper addresses the challenge of learning task specifications in linear temporal logic from expert demonstrations, with the aim of alleviating the burden of manual specification engineering. The rich semantics of temporal logics provide an interpretable framework for describing complex, multi-stage tasks. We propose a method that iteratively learns a task specification together with a nominal policy that solves this task. In each iteration, the task specification is refined to better distinguish expert trajectories from trajectories sampled from the nominal policy. This process yields a concise and interpretable task specification. Unlike previous work, our method learns directly from trajectories in the original state space and does not require predefined atomic propositions. We demonstrate its effectiveness on multiple tasks in both an office environment and a Minecraft-inspired environment.
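The iterative loop described above can be sketched in miniature. This is an illustrative sketch only: the event-sequence trajectory encoding, the fixed "eventually a, then eventually b" template, and the separation score are assumptions made for this example, not the paper's method, which learns richer LTL formulas directly from state-space trajectories without predefined atomic propositions.

```python
def satisfies(traj, spec):
    """Check the pattern F(a & F b): event a occurs, then later event b."""
    a, b = spec
    if a not in traj:
        return False
    return b in traj[traj.index(a) + 1:]

def refine_spec(expert_trajs, nominal_trajs, events):
    """Pick the candidate spec that best separates expert from nominal runs."""
    candidates = [(a, b) for a in events for b in events if a != b]

    def separation(spec):
        # expert trajectories should satisfy the spec; nominal ones should not
        return (sum(satisfies(t, spec) for t in expert_trajs)
                - sum(satisfies(t, spec) for t in nominal_trajs))

    return max(candidates, key=separation)

def learn_specification(expert_trajs, sample_nominal, events, iters=5):
    """Alternate between refining the spec and resampling the nominal policy."""
    spec = None
    for _ in range(iters):
        nominal_trajs = sample_nominal(spec)  # roll out the current policy
        spec = refine_spec(expert_trajs, nominal_trajs, events)
    return spec
```

In the paper the nominal policy is retrained against the current specification at each iteration; here the `sample_nominal` callback stands in for that reinforcement-learning step.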
Acknowledgements
S.L. and P.S. acknowledge the financial support from the Flanders AI Research Program.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Baert, M., Leroux, S., Simoens, P. (2024). Learning Temporal Task Specifications From Demonstrations. In: Calvaresi, D., et al. Explainable and Transparent AI and Multi-Agent Systems. EXTRAAMAS 2024. Lecture Notes in Computer Science(), vol 14847. Springer, Cham. https://doi.org/10.1007/978-3-031-70074-3_5
Print ISBN: 978-3-031-70073-6
Online ISBN: 978-3-031-70074-3