research-article

Towards safe machine learning for CPS: infer uncertainty from training data

Authors:

Arvind Easwaran

ICCPS '19: Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems

Pages 249 - 258

https://doi.org/10.1145/3302509.3311038

Published: 16 April 2019

Abstract

Machine learning (ML) techniques are increasingly applied to decision-making and control problems in Cyber-Physical Systems (CPS), many of which are safety-critical, e.g., chemical plants, robotics, and autonomous vehicles. Despite their significant benefits, ML techniques also raise additional safety issues because 1) the most expressive and powerful ML models are not transparent and behave as black boxes, and 2) the training data, which plays a crucial role in ML safety, is usually incomplete. An important technique for achieving safety with ML models is "Safe Fail": when a model has low confidence in a prediction, it selects a reject option and hands control to a backup solution, for example a traditional controller or a human operator.
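The "Safe Fail" pattern described above can be illustrated with a minimal Python sketch. This is not code from the paper: the names `safe_fail_predict`, `toy_model`, `toy_backup`, and the `min_confidence` threshold are hypothetical, and the sketch assumes the model returns a confidence value alongside each prediction.

```python
from typing import Callable, Sequence, Tuple


def safe_fail_predict(
    model_predict: Callable[[Sequence[float]], Tuple[float, float]],
    backup: Callable[[Sequence[float]], float],
    x: Sequence[float],
    min_confidence: float = 0.8,  # hypothetical threshold, chosen for illustration
) -> Tuple[float, str]:
    """Use the model's output when it is confident, otherwise reject and use the backup."""
    prediction, confidence = model_predict(x)
    if confidence >= min_confidence:
        return prediction, "model"
    # Reject option: the model is not trusted for this input.
    return backup(x), "backup"


def toy_model(x: Sequence[float]) -> Tuple[float, float]:
    """Toy stand-in model: confident only when every feature lies in [0, 1]."""
    inside = all(0.0 <= v <= 1.0 for v in x)
    return sum(x), (0.95 if inside else 0.30)


def toy_backup(x: Sequence[float]) -> float:
    """Toy stand-in for a traditional controller: always returns a safe default."""
    return 0.0
```

With these toy definitions, an input inside the unit box is answered by the model, while an input outside it is routed to the backup. How the confidence value is obtained is the hard part in practice.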

Data-driven models produced by ML algorithms learn from training data, and hence they are only as good as the examples they have learned from. As pointed out in [17], ML models work well in the "training space" (i.e., the region of the feature space with sufficient training data), but they cannot extrapolate beyond it. As observed in many previous studies, a region of the feature space that lacks training data generally has a much higher error rate than one that contains sufficient training samples [31]. Therefore, it is essential to identify the training space and avoid extrapolating beyond it. In this paper, we propose an efficient Feature Space Partitioning Tree (FSPT) to address this problem. Using experiments, we also show that a strong relationship exists between model performance and FSPT score.
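To make the idea of a "training space" concrete, the sketch below scores a query point by how many training samples fall in the same region of the feature space. It is a deliberately simple stand-in that uses a fixed uniform grid; it is not the FSPT algorithm, whose construction the abstract does not describe. The class name `GridTrainingSpaceScore` and the parameters `bins` and `enough` are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


class GridTrainingSpaceScore:
    """Hypothetical stand-in (not FSPT): score = training density of the query's grid cell."""

    def __init__(self, train: Sequence[Sequence[float]], bins: int = 4, enough: int = 3):
        self.bins = bins      # number of cells per feature
        self.enough = enough  # sample count at which a cell is considered well covered
        dims = len(train[0])
        # Bounding box of the training data, one (lo, hi) pair per feature.
        self.lo: List[float] = [min(p[d] for p in train) for d in range(dims)]
        self.hi: List[float] = [max(p[d] for p in train) for d in range(dims)]
        self.counts = Counter(self._cell(p) for p in train)

    def _cell(self, p: Sequence[float]) -> Optional[Tuple[int, ...]]:
        """Map a point to its grid cell, or None if it lies outside the bounding box."""
        cell = []
        for d, v in enumerate(p):
            if v < self.lo[d] or v > self.hi[d]:
                return None
            width = (self.hi[d] - self.lo[d]) / self.bins or 1.0  # guard zero-width features
            cell.append(min(int((v - self.lo[d]) / width), self.bins - 1))
        return tuple(cell)

    def score(self, x: Sequence[float]) -> float:
        """Return a value in [0, 1]; 0 means no nearby training data."""
        cell = self._cell(x)
        if cell is None:
            return 0.0
        return min(1.0, self.counts[cell] / self.enough)


# Toy training set: three samples clustered near (0.1, 0.1), one near (0.9, 0.9).
train = [(0.1, 0.1), (0.15, 0.12), (0.2, 0.18), (0.9, 0.9)]
scorer = GridTrainingSpaceScore(train, bins=2, enough=3)
```

A query near the dense cluster receives the maximum score, a query in an empty cell or outside the bounding box receives zero, and a query near the single isolated sample falls in between. A low score could then serve as the trigger for a reject option.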

References

[1] [n. d.]. Breast Cancer Wisconsin (Diagnostic) Data Set. https://www.kaggle.com/uciml/breast-cancer-wisconsin-data/home

[2] [n. d.]. Quality Prediction in a Mining Process. https://www.kaggle.com/edumagalhaes/quality-prediction-in-a-mining-process

[3] [n. d.]. SARCOS. http://www.gaussianprocess.org/gpml/data/

[4] Joshua Attenberg, Panos Ipeirotis, and Foster Provost. 2015. Beat the Machine: Challenging Humans to Find a Predictive Model's "Unknown Unknowns". J. Data and Information Quality 6, 1 (2015), 1:1–1:17.

[5] Peter L Bartlett and Marten H Wegkamp. 2008. Classification with a reject option using a hinge loss. Journal of Machine Learning Research 9, Aug (2008), 1823–1840.

[6] Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, et al. 2016. End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016).

[7] Leo Breiman. 2001. Random forests. Machine learning 45, 1 (2001), 5–32.

[8] Leo Breiman, Jerome H Friedman, Richard A Olshen, and Charles J Stone. 1984. Classification and regression trees. Wadsworth & Brooks/Cole Advanced Books & Software.

[9] Dua Dheeru and Efi Karra Taniskidou. 2017. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml

[10] Yarin Gal and Zoubin Ghahramani. 2016. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In international conference on machine learning. 1050–1059.

[11] Muriel Gevrey, Ioannis Dimopoulos, and Sovan Lek. 2003. Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecological modelling 160, 3 (2003), 249–264.

[12] Radu Herbei and Marten H Wegkamp. 2006. Classification with reject option. Canadian Journal of Statistics 34, 4 (2006), 709–721.

[13] Achin Jain, Truong X Nghiem, Manfred Morari, and Rahul Mangharam. 2018. Learning and control using gaussian processes: towards bridging machine learning and controls for physical systems. In Proceedings of the 9th ACM/IEEE International Conference on Cyber-Physical Systems. IEEE Press, 140–149.

[14] Diederik P Kingma and Max Welling. 2013. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013).

[15] Jing Lei, Max G'Sell, Alessandro Rinaldo, Ryan J Tibshirani, and Larry Wasserman. 2018. Distribution-free predictive inference for regression. J. Amer. Statist. Assoc. (2018), 1–18.

[16] Henry C Lin, Izhak Shafran, Todd E Murphy, Allison M Okamura, David D Yuh, and Gregory D Hager. 2005. Automatic detection and segmentation of robot-assisted surgical motions. In International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 802–810.

[17] Gary Marcus. 2018. Deep learning: A critical appraisal. arXiv preprint arXiv:1801.00631 (2018).

[18] Radford M Neal. 2012. Bayesian learning for neural networks. Vol. 118. Springer Science & Business Media.

[19] Sebastian Nusser, Clemens Otte, and Werner Hauptmann. 2008. Interpretable ensembles of local models for safety-related applications. In ESANN. 301–306.

[20] John Paisley, David Blei, and Michael Jordan. 2012. Variational Bayesian inference with stochastic search. arXiv preprint arXiv:1206.6430 (2012).

[21] Kush R. Varshney and Homa Alemzadeh. 2016. On the Safety of Machine Learning: Cyber-Physical Systems, Decision Sciences, and Data Products. 5 (10 2016).

[22] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why should I trust you?": Explaining the predictions of any classifier. (2016).

[23] Christian Robert. 2014. Machine learning, a probabilistic perspective.

[24] Rick Salay, Rodrigo Queiroz, and Krzysztof Czarnecki. 2017. An Analysis of ISO 26262: Using Machine Learning Safely in Automotive Software. CoRR abs/1709.02435 (2017). arXiv:1709.02435 http://arxiv.org/abs/1709.02435

[25] Matthias Seeger. 2004. Gaussian processes for machine learning. International journal of neural systems 14, 02 (2004), 69–106.

[26] Sakshi Udeshi, Pryanshu Arora, and Sudipta Chattopadhyay. 2018. Automated directed fairness testing. Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering - ASE 2018 (2018).

[27] Kush R Varshney, Ryan J Prenger, Tracy L Marlatt, Barry Y Chen, and William G Hanley. 2013. Practical ensemble classification error bounds for different operating points. IEEE Transactions on Knowledge and Data Engineering 25, 11 (2013), 2590–2601.

[28] Vladimir Vovk, Alex Gammerman, and Glenn Shafer. 2005. Algorithmic Learning in a Random World. Springer-Verlag, Berlin, Heidelberg.

[29] Vladimir Vovk, Ilia Nouretdinov, Alex Gammerman, et al. 2009. On-line predictive linear regression. The Annals of Statistics 37, 3 (2009), 1566–1590.

[30] Gary M Weiss. 1995. Learning with rare cases and small disjuncts. (1995), 558–565.

[31] Gary M Weiss. 2004. Mining with rarity: a unifying framework. ACM Sigkdd Explorations Newsletter 6, 1 (2004), 7–19.

[32] Brian D Williamson, Peter B Gilbert, Noah Simon, and Marco Carone. 2017. Non-parametric variable importance assessment using machine learning techniques. (2017).

Index Terms

Towards safe machine learning for CPS: infer uncertainty from training data
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Classification and regression trees
2. General and reference
  1. Cross-computing tools and techniques
    1. Reliability

Published In

ICCPS '19: Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems

April 2019

367 pages

ISBN:9781450362856

DOI:10.1145/3302509

General Chairs: Xue Liu (McGill University, Canada), Paulo Tabuada (University of California at Los Angeles)

Program Chairs: Miroslav Pajic (Duke University), Linda Bushnell (University of Washington)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGBED: ACM Special Interest Group on Embedded Systems

In-Cooperation

IEEE-CS\TCRT: TC on Real-Time Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICCPS '19

Sponsor:

SIGBED

ICCPS '19: ACM/IEEE 10th International Conference on Cyber-Physical Systems

April 16 - 18, 2019

Montreal, Quebec, Canada

Acceptance Rates

Overall Acceptance Rate 25 of 91 submissions, 27%

Bibliometrics

Total Citations: 21
Total Downloads: 579
Downloads (Last 12 months): 45
Downloads (Last 6 weeks): 1

Reflects downloads up to 16 Feb 2025

Citations

Cited By

Amir G, Maayan O, Zelazny T, Katz G, Schapira M. (2024). Verifying the Generalization of Deep Learning to Out-of-Distribution Domains. Journal of Automated Reasoning 68:3. Online publication date: 3-Aug-2024. https://doi.org/10.1007/s10817-024-09704-7

Qin X, Xia Y, Zutshi A, Fan C, Deshmukh J. (2023). Statistical Verification using Surrogate Models and Conformal Inference and a Comparison with Risk-Aware Verification. ACM Transactions on Cyber-Physical Systems 8:2, pp. 1-25. Online publication date: 5-Dec-2023. https://dl.acm.org/doi/10.1145/3635160

Dey S, Lee S, Hong J, Lanperne M, Park J, Cerny T, Shahriar H. (2023). A Multi-layered Collaborative Framework for Evidence-driven Data Requirements Engineering for Machine Learning-based Safety-critical Systems. Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, pp. 1404-1413. Online publication date: 27-Mar-2023. https://dl.acm.org/doi/10.1145/3555776.3577647

Maruf M, Azim A, Auluck N, Sahi M. (2023). Towards Safe Online Machine Learning Model Training and Inference on Edge Networks. 2023 International Conference on Machine Learning and Applications (ICMLA), pp. 1082-1089. Online publication date: 15-Dec-2023. https://doi.org/10.1109/ICMLA58977.2023.00161

Cai F, Koutsoukos X. (2023). Real-time detection of deception attacks in cyber-physical systems. International Journal of Information Security 22:5, pp. 1099-1114. Online publication date: 29-Mar-2023. https://doi.org/10.1007/s10207-023-00677-z

McCarthy A, Ghadafi E, Andriotis P, Legg P. (2022). Functionality-Preserving Adversarial Machine Learning for Robust Classification in Cybersecurity and Intrusion Detection Domains: A Survey. Journal of Cybersecurity and Privacy 2:1, pp. 154-190. Online publication date: 17-Mar-2022. https://doi.org/10.3390/jcp2010010

Catak F, Yue T, Ali S. (2022). Uncertainty-aware Prediction Validator in Deep Learning Models for Cyber-physical System Data. ACM Transactions on Software Engineering and Methodology 31:4, pp. 1-31. Online publication date: 28-Mar-2022. https://dl.acm.org/doi/10.1145/3527451

Alimonda N, Guidotto L, Malandri L, Mercorio F, Mezzanzanica M, Tosi G. (2022). A Survey on XAI for Cyber Physical Systems in Medicine. 2022 IEEE International Conference on Metrology for Extended Reality, Artificial Intelligence and Neural Engineering (MetroXRAINE), pp. 265-270. Online publication date: 26-Oct-2022. https://doi.org/10.1109/MetroXRAINE54828.2022.9967673

Rahiminasab Z, Yuhas M, Easwaran A. (2022). Out of Distribution Reasoning by Weakly-Supervised Disentangled Logic Variational Autoencoder. 2022 6th International Conference on System Reliability and Safety (ICSRS), pp. 169-178. Online publication date: 23-Nov-2022. https://doi.org/10.1109/ICSRS56243.2022.10067434

Qin X, Xian Y, Zutshi A, Fan C, Deshmukh J. (2022). Statistical Verification of Cyber-Physical Systems using Surrogate Models and Conformal Inference. 2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS), pp. 116-126. Online publication date: May-2022. https://doi.org/10.1109/ICCPS54341.2022.00017
