Incorporating Logic in Online Preference Learning for Safe Personalization of Autonomous Vehicles

Authors:
Ruya Karagulle, University of Michigan, USA (ORCID: 0000-0002-9536-7053)
Necmiye Ozay, University of Michigan, USA (ORCID: 0000-0002-5552-4392)
Nikos Arechiga, Toyota Research Institute, USA (ORCID: 0009-0005-5585-7006)
Jonathan Decastro, Toyota Research Institute, USA (ORCID: 0000-0002-0933-9671)
Andrew Best, Toyota Research Institute, USA (ORCID: 0009-0000-5128-0282)
HSCC '24: Proceedings of the 27th ACM International Conference on Hybrid Systems: Computation and Control, May 2024, Article No. 5, Pages 1–11. https://doi.org/10.1145/3641513.3650129

Published: 14 May 2024

ABSTRACT

Customizing autonomous vehicles to align with user preferences while ensuring safety may significantly impact their adoption. Collecting user preference data by asking a large number of comparison questions can be demanding. In this work, we use active learning along with temporal logic descriptions of constraints to enable safe learning of preferences with a reduced number of questions. We take a Bayesian inference approach combined with Weighted Signal Temporal Logic (WSTL), resulting in a WSTL formula that can rank signals based on user preferences and be used for correct-and-custom-by-construction control synthesis. Our method is practical for formulas and signals of varying complexity because STL-related values are computed offline. We provide an upper bound on the number of answers in disagreement with user answers. We demonstrate the performance of our method both on synthetic data and in human-subject experiments in an immersive driving simulator. We consider two driving scenarios: one in which a vehicle approaches a pedestrian crossing and one involving an overtake maneuver. Our results on synthetic experiments with ground-truth weight valuations show that our query selection algorithm converges faster than random query selection. Human-subject study results show an average agreement of 94% with user answers during training and 79% during validation (86% when restricted to high-confidence results).
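
The abstract describes the loop only at a high level, so the following is a minimal sketch of one way such a loop could look, not the authors' algorithm. It assumes a finite set of candidate weight valuations, a table `scores[k, i]` computed offline that stands in for the WSTL robustness of signal `i` under candidate `k`, a logistic (Bradley-Terry style) answer model, and expected information gain as the query selection criterion. All function names, the answer model, and the selection criterion are illustrative assumptions; the random table in `run_demo` is synthetic placeholder data, not WSTL output.

```python
import numpy as np

def preference_likelihood(scores, a, b):
    """P(user prefers signal a over b) under each candidate (assumed logistic model)."""
    return 1.0 / (1.0 + np.exp(-(scores[:, a] - scores[:, b])))

def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli variable, clipped for numerical safety."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def select_query(scores, posterior, pairs):
    """Return the pair whose answer has the largest expected information gain."""
    best_pair, best_gain = None, -np.inf
    for a, b in pairs:
        lik = preference_likelihood(scores, a, b)
        marginal = posterior @ lik  # predictive P(a preferred) under current belief
        gain = binary_entropy(marginal) - posterior @ binary_entropy(lik)
        if gain > best_gain:
            best_pair, best_gain = (a, b), gain
    return best_pair

def update_posterior(scores, posterior, a, b, prefers_a):
    """Bayes update of the belief over candidates after one comparison answer."""
    lik = preference_likelihood(scores, a, b)
    if not prefers_a:
        lik = 1.0 - lik
    unnorm = posterior * lik
    return unnorm / unnorm.sum()

def run_demo(num_candidates=50, num_signals=12, num_queries=15, true_idx=7, seed=0):
    """Simulate a noiseless user whose preferences follow candidate true_idx."""
    rng = np.random.default_rng(seed)
    # Placeholder for offline-computed robustness values (synthetic, hypothetical).
    scores = rng.normal(scale=3.0, size=(num_candidates, num_signals))
    posterior = np.full(num_candidates, 1.0 / num_candidates)  # uniform prior
    pairs = [(a, b) for a in range(num_signals) for b in range(a + 1, num_signals)]
    for _ in range(num_queries):
        a, b = select_query(scores, posterior, pairs)
        prefers_a = bool(scores[true_idx, a] > scores[true_idx, b])
        posterior = update_posterior(scores, posterior, a, b, prefers_a)
        pairs.remove((a, b))  # never ask the same question twice
    return posterior, true_idx
```

Because the score table is computed once, each question costs only array arithmetic over the candidates, which mirrors the abstract's point about computing STL-related values offline. Calling `run_demo()` returns the final belief over candidates together with the index of the simulated user's candidate, so the two can be compared.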


Published in
HSCC '24: Proceedings of the 27th ACM International Conference on Hybrid Systems: Computation and Control
May 2024
307 pages
ISBN: 9798400705229
DOI: 10.1145/3641513
Editors: Erika Ábrahám, Manuel Mazo
Copyright © 2024 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Results Reproduced / v1.1
Author Tags
autonomous driving
preference learning
temporal logic
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 153 of 373 submissions, 41%