ABSTRACT
Developers continuously implement changes to meet user demands. In the context of test-driven development, before any new code is added, a test case should be written to ensure that new changes do not introduce bugs. During this process, developers and testers might adopt poor design choices, which may lead to the introduction of so-called Test Smells in the code. Test Smells are bad solutions for implementing or designing test code. We perform a broad study to investigate participants' perceptions of the presence of Test Smells. We analyze whether factors related to participants' profiles, concerning background and experience, influence their perception of Test Smells. We also analyze whether the heuristics adopted by developers influence their perceptions of the existence of Test Smells. We analyze commits of open-source projects to identify the introduction of Test Smells. Then, we conduct an empirical study with 25 participants who evaluate instances of 10 different smell types. For each Test Smell type, we analyze the agreement among participants and assess the influence of different factors on their evaluations. Altogether, more than 1,250 evaluations were made. The results indicate that participants show low agreement in detecting all 10 Test Smell types analyzed in our study. The results also suggest that factors related to background and experience do not have a consistent effect on the agreement among participants. On the other hand, the results indicate that agreement is consistently influenced by specific heuristics employed by participants. Our findings reveal that participants detect Test Smells in significantly different ways. As a consequence, these findings raise questions about the results of previous studies that do not account for the different perceptions of participants when detecting Test Smells.
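As a minimal illustration (not taken from the study itself), consider Assertion Roulette, one of the commonly studied smell types: a test method that piles up several assertions without explanatory messages, so a failure does not tell the reader which expectation broke. The sketch below, in Python's standard `unittest` framework with a hypothetical `order` fixture, contrasts a smelly test with a refactored one.

```python
import unittest


class OrderTest(unittest.TestCase):
    # Smelly: Assertion Roulette. Several unrelated assertions share one
    # test method, and none carries a message, so a failure report does
    # not indicate which expectation was violated.
    def test_order_totals(self):
        order = {"items": 3, "subtotal": 30.0, "tax": 3.0, "total": 33.0}
        self.assertEqual(order["items"], 3)
        self.assertEqual(order["subtotal"], 30.0)
        self.assertEqual(order["tax"], 3.0)
        self.assertEqual(order["total"], 33.0)

    # Refactored: one concern per test, with a message that explains
    # the expectation when it fails.
    def test_total_includes_tax(self):
        order = {"subtotal": 30.0, "tax": 3.0, "total": 33.0}
        self.assertEqual(
            order["total"],
            order["subtotal"] + order["tax"],
            "total should equal subtotal plus tax",
        )
```

Both tests pass; the difference only becomes visible when one fails, which is precisely why such smells are easy to overlook and why perceptions of them may vary between developers.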
Do you see any problem? On the Developers Perceptions in Test Smells Detection