Abstract
There have been active efforts to use machine learning (ML) techniques in the development of smart systems, e.g., driving support systems with image recognition. However, the behavior of ML components, e.g., neural networks, is inductively derived from training data and is thus uncertain and imperfect. Quality assessment depends heavily on, and is restricted by, a test data set, or what has been tried among an enormous number of possibilities. Given this unique nature, we propose an MLQ framework for assessing the quality of ML components and ML-based systems. We introduce concepts that capture the activities and evidence involved in the assessment and that support the construction of arguments.
Notes
1. We avoid the confusion by calling this a “model” as in the ML community.
Acknowledgments
This work is partially supported by the ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603), JST. We are thankful to the industry researchers and engineers who gave deep insights into the difficulties in the engineering of ML and practices in the case-study scenario.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Ishikawa, F. (2018). Concepts in Quality Assessment for Machine Learning - From Test Data to Arguments. In: Trujillo, J., et al. Conceptual Modeling. ER 2018. Lecture Notes in Computer Science, vol 11157. Springer, Cham. https://doi.org/10.1007/978-3-030-00847-5_39
DOI: https://doi.org/10.1007/978-3-030-00847-5_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00846-8
Online ISBN: 978-3-030-00847-5
eBook Packages: Computer Science, Computer Science (R0)