
Robustness-Enhanced Assertion Generation Method Based on Code Mutation and Attack Defense

  • Conference paper
  • In: Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2023)

Abstract

Writing high-quality unit tests plays a crucial role in discovering and diagnosing errors at an early stage and preventing their propagation throughout the development cycle. However, the low readability of the test cases produced by existing automated test generation tools hinders developers from using them directly. In addition, current assertion generation approaches are sensitive to individual tokens in the input code and often produce completely different results after only minor changes to it. To tackle these problems, we propose AssertGen, a robust Java assertion generation model that maintains consistent output under minor variations of the input code snippet. Inspired by software mutation testing, we propose 11 heuristic code mutation strategies that make small changes to the textual or structural information of the code, producing variants that remain human-readable but can mislead the model. We then use these variants to attack the model and test its robustness. We observe that the mutation based on variable names (VM), the mutation based on method names (FM), and the False_Control_Flow mutation, which inserts additional control flow, have the greatest impact on the quality of the generated assertions. To enhance the robustness of AssertGen, we apply multiple mutations to expand the original dataset, allowing the model to learn during training how to counter the instability caused by mutations. Experimental results show that our assertion generation model achieves a BLEU score of 60.08 and a perfect prediction rate of 47.91%, significantly surpassing previous work.
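To make the three most impactful mutation strategies concrete, the sketch below shows how a focal method might be rewritten by the variable-name (VM), method-name (FM), and False_Control_Flow mutations while keeping its behavior, and therefore the expected assertion, unchanged. This is a minimal illustration assembled for this summary; the class, method, and identifier names are hypothetical and are not code from the paper or its dataset.

```java
// Illustrative sketch of three mutation strategies named in the abstract.
// All identifiers are hypothetical; none are taken from the paper.
public class MutationSketch {

    // Original focal method that a test asserting "assertEquals(5, add(2, 3))" would target.
    static int add(int left, int right) {
        return left + right;
    }

    // VM (variable-name mutation): rename parameters/locals only; logic is unchanged.
    static int addAfterVm(int a, int b) {
        return a + b;
    }

    // FM (method-name mutation): rename the method; the body is untouched.
    static int compute(int left, int right) {
        return left + right;
    }

    // False_Control_Flow: insert a branch that can never execute,
    // changing the code structure but not its behavior.
    static int addAfterFalseControlFlow(int left, int right) {
        if (false) {
            return 0; // dead branch introduced by the mutation
        }
        return left + right;
    }

    public static void main(String[] args) {
        // Behavior is identical across all variants, so a robust model
        // should generate the same assertion for each of them.
        System.out.println(add(2, 3));                        // 5
        System.out.println(addAfterVm(2, 3));                 // 5
        System.out.println(compute(2, 3));                    // 5
        System.out.println(addAfterFalseControlFlow(2, 3));   // 5
    }
}
```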




Acknowledgments

This project was funded by the National Natural Science Foundation of China (62032016, 61832014).

Author information

Corresponding author

Correspondence to Hongyue Wu.


Copyright information

© 2024 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper


Cite this paper

Li, M. et al. (2024). Robustness-Enhanced Assertion Generation Method Based on Code Mutation and Attack Defense. In: Gao, H., Wang, X., Voros, N. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 562. Springer, Cham. https://doi.org/10.1007/978-3-031-54528-3_16


  • DOI: https://doi.org/10.1007/978-3-031-54528-3_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54527-6

  • Online ISBN: 978-3-031-54528-3

  • eBook Packages: Computer Science, Computer Science (R0)
