Concepts in Quality Assessment for Machine Learning - From Test Data to Arguments

Ishikawa, Fuyuki

doi:10.1007/978-3-030-00847-5_39

Fuyuki Ishikawa ORCID: orcid.org/0000-0001-7725-2618

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11157)

Included in the following conference series:

International Conference on Conceptual Modeling

Abstract

There have been active efforts to use machine learning (ML) techniques for the development of smart systems, e.g., driving support systems with image recognition. However, the behavior of ML components, e.g., neural networks, is inductively derived from training data and is thus uncertain and imperfect. Quality assessment depends heavily on, and is restricted by, a test data set, or what has been tried among an enormous number of possibilities. Given this unique nature, we propose an MLQ framework for assessing the quality of ML components and ML-based systems. We introduce concepts to capture activities and evidence for the assessment and to support the construction of arguments.

Notes

1. We avoid the confusion by calling this as a “model” as in the ML community.

Acknowledgments

This work is partially supported by the ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603), JST. We are thankful to the industry researchers and engineers who gave deep insights into the difficulties in the engineering of ML and practices in the case-study scenario.

Author information

Authors and Affiliations

National Institute of Informatics, Tokyo, Japan
Fuyuki Ishikawa

Corresponding author

Correspondence to Fuyuki Ishikawa.

Editor information

Editors and Affiliations

Lucentia, University of Alicante, Alicante, Spain
Juan C. Trujillo
Miami University, Oxford, OH, USA
Karen C. Davis
Renmin University of China, Beijing, China
Xiaoyong Du
Northwestern Polytechnical University, Xian, China
Zhanhuai Li
Department of Computer Science, National University of Singapore, Singapore, Singapore
Tok Wang Ling
Department of Computer Science and Technology, Tsinghua University, Beijing, Beijing, China
Guoliang Li
National University of Singapore, Singapore, Singapore
Mong Li Lee

About this paper

Cite this paper

Ishikawa, F. (2018). Concepts in Quality Assessment for Machine Learning - From Test Data to Arguments. In: Trujillo, J., et al. Conceptual Modeling. ER 2018. Lecture Notes in Computer Science, vol 11157. Springer, Cham. https://doi.org/10.1007/978-3-030-00847-5_39

DOI: https://doi.org/10.1007/978-3-030-00847-5_39
Published: 26 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00846-8
Online ISBN: 978-3-030-00847-5
eBook Packages: Computer Science, Computer Science (R0)
