DOI: 10.1145/3641513.3650129
research-article
Results Reproduced / v1.1

Incorporating Logic in Online Preference Learning for Safe Personalization of Autonomous Vehicles

Published: 14 May 2024

ABSTRACT

Customizing autonomous vehicles to align with user preferences while ensuring safety may significantly impact their adoption. Collecting user preference data by asking a large number of comparison questions can be demanding. In this work, we use active learning along with temporal logic descriptions of constraints to enable safe learning of preferences with a reduced number of questions. We take a Bayesian inference approach combined with Weighted Signal Temporal Logic (WSTL), resulting in a WSTL formula that can rank signals based on user preferences and be used for correct-and-custom-by-construction control synthesis. Our method is practical for formulas and signals of varying complexity since we compute STL-related values offline. We provide an upper bound on the number of answers in disagreement with user answers. We demonstrate the performance of our method both on synthetic data and through human subject experiments in an immersive driving simulator. We consider two driving scenarios, one involving a vehicle approaching a pedestrian crossing and the other involving an overtake maneuver. Our results on synthetic experiments with ground truth weight valuation show that our query selection algorithm converges faster than random query selection. Human subject study results show an average agreement of 94% with user answers during training, and 79% during validation (which increases to 86% when restricted to high-confidence results).
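To make the idea of Bayesian preference learning with active query selection concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a finite set of candidate weight valuations, a hypothetical table `scores[k][i]` computed offline (a stand-in for WSTL robustness of signal `i` under candidate `k`), a Bradley-Terry style answer model, and a greedy rule that picks the query with the lowest expected posterior entropy. All function names and numeric values are illustrative assumptions.

```python
import math

def answer_likelihood(score_a, score_b, prefers_a):
    """Bradley-Terry style probability of the observed answer (an assumed noise model)."""
    p_a = 1.0 / (1.0 + math.exp(-(score_a - score_b)))
    return p_a if prefers_a else 1.0 - p_a

def update_posterior(posterior, scores, query, prefers_a):
    """Bayes update over candidate weight valuations for one answered query."""
    a, b = query
    unnorm = [p * answer_likelihood(s[a], s[b], prefers_a)
              for p, s in zip(posterior, scores)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)

def expected_posterior_entropy(posterior, scores, query):
    """Average entropy of the posterior over both possible answers to `query`."""
    a, b = query
    expected = 0.0
    for prefers_a in (True, False):
        p_answer = sum(p * answer_likelihood(s[a], s[b], prefers_a)
                       for p, s in zip(posterior, scores))
        expected += p_answer * entropy(
            update_posterior(posterior, scores, query, prefers_a))
    return expected

def select_query(posterior, scores, queries):
    """Greedy choice: the query expected to leave the least uncertainty."""
    return min(queries,
               key=lambda q: expected_posterior_entropy(posterior, scores, q))

# Hypothetical offline table: scores[k][i] is the score of signal i under
# candidate valuation k. Real values would come from WSTL robustness
# computations; these numbers are made up for illustration.
scores = [
    [1.0, 0.0, 0.5, 1.0],  # candidate 0
    [0.0, 1.0, 0.5, 1.0],  # candidate 1
]
posterior = [0.5, 0.5]      # uniform prior over the two candidates
queries = [(2, 3), (0, 1)]  # pairs of signal indices the user could compare

query = select_query(posterior, scores, queries)
# Simulated user answer: the first signal of the chosen pair is preferred.
posterior = update_posterior(posterior, scores, query, prefers_a=True)
```

In this toy table the pair `(2, 3)` has the same score difference under both candidates, so its answer cannot separate them, while the pair `(0, 1)` can. Because the scores are looked up rather than recomputed, each question costs only a few arithmetic operations, which is the practical benefit of computing the logic-related values offline.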

        • Published in

          HSCC '24: Proceedings of the 27th ACM International Conference on Hybrid Systems: Computation and Control
          May 2024
          307 pages
          ISBN:9798400705229
          DOI:10.1145/3641513

          Copyright © 2024 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate: 153 of 373 submissions, 41%