
Two experiments for evaluating the impact of Hamcrest and AssertJ on assertion development

Software Quality Journal

Abstract

Test automation enables continuous testing, a cornerstone of agile methods and DevOps. Assertions play a fundamental role in test automation, and competing assertion libraries for unit testing frameworks such as JUnit or TestNG have recently emerged. It is therefore important to gauge assertion libraries in terms of developer/tester productivity, so that SQA managers and software testers can select the most suitable one. The goal of this work is to compare two assertion libraries that follow different approaches (matchers vs. fluent assertions) with respect to two dependent variables: the correctness of the developed assertions and the time needed to develop them. We conducted two controlled experiments with Bachelor students in Computer Science and Master students in Computer Engineering. AssertJ (fluent assertions) is compared with Hamcrest (matchers) in a Java test development scenario in which 48 students developed 672 assertions overall. The results show that (a) adopting AssertJ significantly improves productivity in developing assertions, but only for Bachelor students, and (b) the time needed to develop assertions is similar with AssertJ and Hamcrest for both categories of participants. Testers and SQA managers selecting an assertion library for their organization should consider AssertJ as the first choice when developers/testers are inexperienced, since our study shows that it increases the productivity of Bachelor students more than Hamcrest does.
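To make the contrast between the two approaches concrete, the following is a minimal, self-contained sketch (our own illustration, not taken from the experimental material; the test class, the sample data, and the JUnit 5 setup are assumptions) showing the same checks written with Hamcrest matchers and with AssertJ fluent assertions:

    import static org.hamcrest.MatcherAssert.assertThat;
    import static org.hamcrest.Matchers.equalTo;
    import static org.hamcrest.Matchers.hasItem;

    import java.util.List;

    import org.assertj.core.api.Assertions;
    import org.junit.jupiter.api.Test;

    class AssertionStyleExampleTest {

        private final List<String> fellowship = List.of("Frodo", "Sam", "Gandalf");

        @Test
        void hamcrestMatcherStyle() {
            // Hamcrest: the expected condition is expressed as a matcher object
            // passed alongside the actual value.
            assertThat(fellowship.get(0), equalTo("Frodo"));
            assertThat(fellowship, hasItem("Gandalf"));
        }

        @Test
        void assertJFluentStyle() {
            // AssertJ: assertions are chained fluently on the actual value.
            Assertions.assertThat(fellowship.get(0)).isEqualTo("Frodo");
            Assertions.assertThat(fellowship)
                      .hasSize(3)
                      .contains("Gandalf")
                      .doesNotContain("Sauron");
        }
    }

In the fluent style, the available assertions are discovered by chaining methods on the actual value (which IDE auto-completion supports directly), whereas in the matcher style the tester must know which matcher factory methods to import; this usability difference is one of the arguments commonly made in favor of fluent assertion libraries.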


Notes

  1. https://www.infosys.com/it-services/validation-solutions/white-papers/documents/choosing-right-automation-tool.pdf

  2. http://hamcrest.org/

  3. https://joel-costigliola.github.io/assertj/

  4. https://trends.google.com/trends/explore?q=Assertj,Hamcrest

  5. https://junit.org

  6. https://redmonk.com/fryan/2018/03/26/a-look-at-unit-testing-frameworks/

  7. https://junit.org/junit5/docs/snapshot/user-guide/#writing-tests

  8. This example, like the following ones, is taken from https://joel-costigliola.github.io/assertj/

  9. https://www.linkup.com/

  10. The O*NET taxonomy is a catalog of hundreds of occupational definitions developed under the sponsorship of the US Department of Labor/Employment and Training Administration in the 1990s, available as a free online database.

  11. A hundred hits out of almost 5 million rows may suggest tepid interest in both assertion libraries. However, job ads are written at a rather high level of abstraction, and they most often mention technological categories rather than specific products.

  12. On GitHub users star other users’ repositories to express appreciation.

  13. Watchers are users who asked to be notified of repository changes.

  14. https://searchcode.com/

  15. The information needed to apply the selection criteria is not completely accessible via the GitHub API. For example, it is not possible to directly ask for the number of commits, but only for a list of commits that may be divided into different pages, thus requiring several calls to the API. So, we retrieved the required information from the repository GitHub home page, using Requests-HTML for Python (https://html.python-requests.org/).

  16. https://github.com/Arkni/json-to-csv

  17. BSc students developed 107 correct assertions with Hamcrest (i.e., 37% of all those required) and 133 with AssertJ (i.e., 46%). Thus, adopting AssertJ allowed them to develop 26 more correct assertions, an increase of about 24% over the 107 obtained with Hamcrest.

  18. Note that, when filling in the post-experiment questionnaire, participants were not aware of the correctness of their assertions. Thus, they answered on the basis of how many assertions they had developed, assuming them to be correct even when they were not. This explains why, for instance, BSc students perceived no differences in PQ1: with both treatments they developed about the same number of assertions (see Table 5), even though the number of correct ones differs between AssertJ and Hamcrest.

  19. https://www.seleniumhq.org/projects/webdriver/

  20. https://eyeautomate.com/

  21. https://www.blazemeter.com/blog/hamcrest-vs-assertj-assertion-frameworks-which-one-should-you-choose, https://www.ontestautomation.com/three-practices-for-creating-readable-test-code/, https://www.developer.com/java/article.php/3901236/

  22. https://www.linkup.com/


Acknowledgements

We are grateful to LinkUp (Footnote 22) for granting us access to their database of global job listings collected directly from company websites, i.e., the raw data used to produce the results reported in Section 2.3.

Author information


Corresponding author

Correspondence to Maurizio Leotta.

Ethics declarations

Disclaimer

Responsibility for any information and result reported in this paper lies entirely with the Authors. LinkUp was not involved in any form of data analysis.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection on Quality Management for Information Systems.

Guest Editors: Mario Piattini, Ignacio García Rodríguez de Guzmán, Ricardo Pérez del Castillo


About this article


Cite this article

Leotta, M., Cerioli, M., Olianas, D. et al. Two experiments for evaluating the impact of Hamcrest and AssertJ on assertion development. Software Qual J 28, 1113–1145 (2020). https://doi.org/10.1007/s11219-020-09507-0

