Abstract
Machine learning software is non-testable in the sense that training results are not available in advance. Metamorphic testing, which relies on a pseudo oracle, is a promising approach to testing such machine learning programs. Machine learning software operates on large collections of data, and slight changes in the input training dataset can have a large impact on training results. This paper proposes a new metamorphic testing method applicable to neural network learning models. The key ideas are dataset diversity and a behavioral oracle. Dataset diversity takes into account the dataset dependency of training results and provides a new way of generating follow-up test inputs. The behavioral oracle monitors changes in certain statistical indicators as training proceeds and forms the basis of the metamorphic relations to be checked. The proposed method is illustrated with a case study: testing neural network programs that classify handwritten digits.
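The general scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the centroid-based "training" function and the shuffle-based follow-up dataset are simplified stand-ins for a neural network model and for the paper's dataset-diversity operators, and the class centroids stand in for the statistical indicators the behavioral oracle would monitor.

```python
import random

def train_centroids(dataset):
    # "Training": compute per-class mean vectors (a toy stand-in for
    # real learning; its output plays the role of a training result).
    sums, counts = {}, {}
    for x, label in dataset:
        s = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: tuple(v / counts[lab] for v in s) for lab, s in sums.items()}

# Source test input: a tiny labelled dataset.
source = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"),
          ((1.0, 0.9), "b"), ((0.9, 1.1), "b")]

# Follow-up test input via dataset diversity: a label-preserving
# modification of the training dataset (here, a reordering).
follow_up = source[:]
random.Random(42).shuffle(follow_up)

# Metamorphic relation: the monitored statistical indicator
# (the class centroids) must agree between the two training runs.
m_src = train_centroids(source)
m_fup = train_centroids(follow_up)
assert all(
    abs(u - v) < 1e-9
    for lab in m_src
    for u, v in zip(m_src[lab], m_fup[lab])
), "metamorphic relation violated"
print("MR holds")
```

For a real neural network, exact equality would be replaced by a tolerance on the indicator (e.g. loss or accuracy trajectories), since stochastic training makes results agree only approximately.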
Notes
1. Programs are regarded as functions for simplicity.
2. We assume here supervised learning problems.
3. Statistics for \(\mathbf{V}\) are similar.
4. Discussions on \(g[{\mu }_{w}]\) are similar.
Acknowledgment
The author expresses his sincere thanks to Professor T.Y. Chen (Swinburne University) for valuable comments on an early draft. The author is partially supported by JSPS KAKENHI Grant Number JP18H03224.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Nakajima, S. (2019). Dataset Diversity for Metamorphic Testing of Machine Learning Software. In: Duan, Z., Liu, S., Tian, C., Nagoya, F. (eds.) Structured Object-Oriented Formal Language and Method. SOFL+MSVL 2018. Lecture Notes in Computer Science, vol. 11392. Springer, Cham. https://doi.org/10.1007/978-3-030-13651-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13650-5
Online ISBN: 978-3-030-13651-2