BallPri: test cases prioritization for deep neuron networks via tolerant ball in variable space

Jia, Chengyu; Chen, Jinyin; Li, Xiaohao; Zheng, Haibin; Zhang, Luxin

doi:10.1007/s10515-025-00498-5

Published: 06 March 2025

Volume 32, article number 29, (2025)

Automated Software Engineering

Chengyu Jia²,
Jinyin Chen¹,²,
Xiaohao Li²,
Haibin Zheng¹,² &
…
Luxin Zhang³

Abstract

Deep neural networks (DNNs) have gained widespread adoption in various applications, including safety-critical domains such as autonomous driving. However, despite their impressive capabilities and outstanding performance, DNNs can also exhibit incorrect behaviors that may lead to serious accidents. As a result, they urgently require security assurance when applied to safety-critical applications. Deep testing has been developed as an effective technique for detecting incorrect DNN behaviors and improving robustness when necessary, but it needs a large number of labeled test cases, which are expensive to obtain because data labeling is labor-intensive. Test case prioritization has been proposed to identify error-revealing test cases earlier, and several techniques, such as DeepGini and PRIMA, achieve effective and efficient prioritization for classification tasks. However, these methods still face challenges such as unreliable validity, limited application scenarios, and high time complexity. To tackle these issues, we present BallPri, a novel test prioritization method for DNNs that uses the tolerant ball in variable space. It extracts the tolerant balls of different test cases and uses the minimum non-parametric likelihood ratio (MinLR) to further enlarge the difference between their distributions in variable space, achieving effective and general test case prioritization. Extensive experiments on benchmark datasets and models validate that BallPri outperforms state-of-the-art methods in three key aspects: (1) Effective: it leverages the tolerant ball in variable space to identify malicious bug-revealing inputs. Compared with baselines, BallPri improves prioritization effectiveness by 47.83% and prioritization efficiency by 37.27% on average. (2) Extensible: it can be applied to various tasks, data, and models. We verify the superiority of BallPri on classification and regression tasks, on convolutional and recurrent neural network models, and on image, text, and speech datasets. (3) Efficient: it achieves a low time complexity compared with existing methods. We further evaluate BallPri against potential adaptive attacks and provide guidance for its accuracy and robustness. The open-source code of BallPri can be downloaded at https://github.com/lixiaohaao/BallPri.
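
To make the notion of test case prioritization concrete, the following is a minimal sketch of a DeepGini-style baseline, not of BallPri itself (the abstract does not specify how the tolerant ball or MinLR is computed). It assumes softmax outputs are available for each unlabeled test case, scores each case by Gini impurity, and ranks the most uncertain cases first. The ranking is then evaluated with APFD, a common prioritization metric that may differ from the one used in the paper. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def gini_score(probs: Sequence[float]) -> float:
    """DeepGini-style impurity of one softmax vector: 1 - sum_i p_i^2."""
    return 1.0 - sum(p * p for p in probs)


def prioritize(scores: Sequence[float]) -> List[int]:
    """Return test case indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def apfd(order: Sequence[int], is_failure: Sequence[bool]) -> float:
    """Average Percentage of Faults Detected.

    Assumption: each misclassified input counts as one distinct fault.
    """
    n = len(order)
    # 1-based positions at which failing inputs appear in the ranking
    ranks = [pos for pos, idx in enumerate(order, start=1) if is_failure[idx]]
    m = len(ranks)
    if m == 0:
        raise ValueError("APFD is undefined without at least one failing input")
    return 1.0 - sum(ranks) / (n * m) + 1.0 / (2 * n)


# Toy softmax outputs for four test cases (illustrative values only)
softmax_outputs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain prediction
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],  # most uncertain prediction
]
# Hypothetical ground truth: which inputs the model actually gets wrong
is_failure = [False, True, False, True]

scores = [gini_score(p) for p in softmax_outputs]
order = prioritize(scores)
print(order)                    # ranking, most suspicious first
print(apfd(order, is_failure))  # closer to 1.0 means failures surface earlier
```

In this toy setting the two failing inputs are ranked first, so labeling effort spent on the top of the list would expose both errors. A score of this kind depends only on classification probabilities, which is one reason such baselines are limited to classification tasks.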

Data availability

No datasets were generated or analysed during the current study.

Acknowledgements

This research was supported by the Zhejiang Provincial Natural Science Foundation (No. LDQ23F020001), the National Natural Science Foundation of China (Nos. 62072406, 62406286, 62103374), and the Key Research and Development Program of Zhejiang Province (No. 2022C01018).

Author information

Authors and Affiliations

Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou, 310023, Zhejiang, China
Jinyin Chen & Haibin Zheng
College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, Zhejiang, China
Chengyu Jia, Jinyin Chen, Xiaohao Li & Haibin Zheng
National Key Laboratory of Electromagnetic Space Security, The 36th Research Institute of China Electronics Technology Group Corporation, Jiaxing, 100048, Zhejiang, China
Luxin Zhang

Contributions

Chengyu Jia: Conceptualization, Data curation, Funding acquisition, Methodology, Resources, Writing-review & editing, Supervision, Formal analysis. Jinyin Chen: Conceptualization, Data curation, Methodology, Writing-original draft, Visualization, Supervision. Xiaohao Li: Conceptualization, Data curation, Funding acquisition, Methodology, Resources, Writing-review & editing, Supervision, Formal analysis. Haibin Zheng: Methodology, Writing-original draft, Validation. Luxin Zhang: Methodology, Validation.

Corresponding author

Correspondence to Haibin Zheng.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jia, C., Chen, J., Li, X. et al. BallPri: test cases prioritization for deep neuron networks via tolerant ball in variable space. Autom Softw Eng 32, 29 (2025). https://doi.org/10.1007/s10515-025-00498-5

Received: 29 April 2024
Accepted: 02 February 2025
Published: 06 March 2025
DOI: https://doi.org/10.1007/s10515-025-00498-5
