Towards an operationalization of test-driven development skills: An industrial empirical study
Introduction
Test-driven development (TDD) is a software development technique in which the development is guided by writing unit tests. It was popularized in the late 1990s as part of Extreme Programming [1]. A developer using TDD follows four steps:
1. Write a unit test for the functionality she wants to add.
2. Run the unit test to make sure it fails.
3. Write only enough production code to make the test pass.
4. Refactor both production and test code, and re-run the tests.
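As a minimal illustration of the four steps above, the following sketch shows the state of a code base after one TDD cycle. It is not taken from the study: the subjects worked in Java, whereas this sketch uses Python's standard `unittest` module, and the function `is_leap_year`, the test class, and the helper `run_suite` are hypothetical names chosen only for illustration.

```python
import unittest


# Step 3 (written only after the tests below were seen to fail):
# just enough production code to make the tests pass.
def is_leap_year(year: int) -> bool:
    """Hypothetical feature used only for illustration."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Step 1: the unit tests are written first, before is_leap_year exists.
# Step 2: running them at that point fails (NameError), which confirms
# that the tests really exercise the missing functionality.
class LeapYearTest(unittest.TestCase):
    def test_year_divisible_by_four_is_leap(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_leap_only_if_divisible_by_400(self):
        self.assertFalse(is_leap_year(1900))
        self.assertTrue(is_leap_year(2000))


# Step 4: after refactoring production and test code, re-run the suite.
def run_suite() -> bool:
    """Run the tests and report whether all of them passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LeapYearTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Calling `run_suite()` after each refactoring corresponds to the re-run in step 4; a developer would then start the next cycle with a new failing test.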
TDD is claimed to yield better results than traditional approaches to software development (e.g., when unit tests are written after the intended functionality is considered completed by the development team) in terms of developers’ productivity, external quality (e.g., reduced number of defects), maintainability, and extensibility [2], [3]. However, empirical investigations of the effects of TDD report contrasting results [4], [5] and suggest that the outcomes are influenced by several variables (e.g., academic vs. industrial settings), including the skills of developers.
Literature reviews on TDD conclude that the application of the technique—and subsequently the manifestation of its postulated benefits—requires some skills [5], [6]; however, these studies do not indicate what these skills are. We started our investigation on skills with students in a previous study [7]. In that context, we looked at their pre-existing knowledge regarding two practical skills: proficiency with programming language and unit testing (UT). When the subjects tackled a small programming task using TDD, we found that such skills had little impact on their productivity—defined as the output (e.g., parts of the task completed) per unit of effort (e.g., time to complete the task). No significant relationship was observed regarding the quality of the software they produced—e.g., the defects found in the parts of the task which were completed by the subjects. In the same study, we acknowledged that other skills must be present in order for TDD developers to achieve the benefits advocated by TDD supporters.
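The definition of productivity given above (output per unit of effort) can be sketched as a small computation. This is only an illustration under stated assumptions: the record `TaskResult`, its fields, the choice of minutes as the effort unit, and the defects-per-part ratio are hypothetical and are not the measures used in the studies cited here.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Hypothetical record of one subject's work on a task."""
    parts_completed: int   # output: parts of the task completed
    minutes_spent: float   # effort: time spent on the task
    defects_found: int     # defects found in the completed parts


def productivity(result: TaskResult) -> float:
    """Output per unit of effort (here: completed parts per minute)."""
    if result.minutes_spent <= 0:
        raise ValueError("effort must be positive")
    return result.parts_completed / result.minutes_spent


def defects_per_completed_part(result: TaskResult) -> float:
    """Illustrative quality indicator; lower values mean fewer defects."""
    if result.parts_completed == 0:
        raise ValueError("no completed parts to assess")
    return result.defects_found / result.parts_completed
```

For example, a subject who completes 6 parts in 120 minutes with 3 defects has a productivity of 0.05 parts per minute and 0.5 defects per completed part.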
With these motivations based on existing literature and our previous work, we incorporate in this study another practical skill, which we call TDD process conformance, along with programming language and unit test skills. TDD process conformance represents the ability of a developer to follow the TDD cycle. Together, these three skills represent our TDD skill set. Further, we used a more realistic task to overcome the limitations of small programming tasks, and recruited professional developers for the study. Consequently, the research goal of this work is the following:
In our previous studies [7], [8], [9] we have investigated the role that each skill plays individually with student subjects working on toy tasks. We now focus on the impact the skills have, when taken together, on the outcomes of interest, by performing a quasi-experiment involving 43 professional software developers (30 after mortality) without prior working experience in TDD. The developers were trained during a week-long workshop and then asked to implement new features of a legacy system using TDD. Finally, we evaluated the composite effect of their skills on their performance in terms of external quality and productivity. Hence, we contribute to the existing knowledge by:
- Empirically investigating, with professional developers, the anecdotal claim that TDD requires skills to manifest its benefits.
- Building a model for quality and productivity that takes into account a set of practical skills (Section 3).
- Providing initial empirical evidence that further investigation of the proposed TDD skill set is worth pursuing (Section 5).
The strong points of our study lie in the settings (Section 4) in which it was conducted. In particular, we:
- Analyze data collected from professional software developers.
- Utilize a near real-world, brown-field task, rather than a toy, green-field task (see Section 4.2 and Appendix B).
- Quantify process conformance analytically, rather than relying on self-reports.
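To make the idea of quantifying process conformance analytically more concrete, the sketch below computes conformance as the percentage of development episodes that follow the TDD cycle. It rests on assumptions: the study's exact measure is defined in a section not reproduced here, and the episode labels, the set `CONFORMANT_LABELS`, and the function `process_conformance` are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Hypothetical labels. An episode is assumed to be a short burst of
# development activity that a tool has already classified.
CONFORMANT_LABELS = frozenset({"test-first", "refactoring"})


def process_conformance(episodes: Iterable[str]) -> float:
    """Return the percentage of episodes labelled as TDD-conformant."""
    counts = Counter(episodes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no episodes recorded")
    conformant = sum(
        n for label, n in counts.items() if label in CONFORMANT_LABELS
    )
    return 100.0 * conformant / total
```

With the episode sequence `["test-first", "test-first", "refactoring", "test-last"]`, three of the four episodes are conformant, so the function returns 75.0. Deriving the measure from recorded episodes rather than from questionnaires is what distinguishes it from a self-report.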
The rest of the paper is organized as follows. Section 2 presents the existing literature related to our research, and Section 3 defines the TDD skill set used in our study. Section 4 explains the details of our empirical study design. Sections 5 and 6 report the results and the associated discussion. We address the threats to the validity of our study in Section 7 and conclude the paper in Section 8.
Related work
Test-driven development has been the subject of several secondary studies. The systematic literature review by Turhan et al. [5]—covering 32 empirical studies—found positive effects on external quality, whereas the productivity results were inconclusive, when TDD was used across different settings. The meta-analysis by Rafique and Misic [4] is of interest when looking at how experience works with the postulated TDD effects. The work covers 10 years of TDD publications, from 2000 to 2011, in 25
A skill set for TDD
Our goal in this paper is to make a holistic analysis of the skills rather than focusing on them individually. Therefore, we include three skills, i.e., programming and testing skills as well as TDD process conformance, to define a TDD skill set.
Although existing literature acknowledges that skills matter when applying TDD, none indicates the necessary ones. For example, Causevic et al. [6] identified the lack of developers’ skills as one of the main impediments to the adoption of TDD by
Study definition
An overview of the study is presented in Fig. 1. The study seeks the answers to the research questions presented in Section 4.1. We recruited subjects from two companies, in the context of a workshop about UT and TDD (Section 4.2). We assessed the subjects’ skills in Java development and UT at the beginning of the workshop. During the workshop the subjects carried out a brown-field, real-world task (Section 4.3). Subsequently, we collected the necessary data to extract TDD process
Results
In this section, we first report the descriptive statistics of the data, and provide a sanity check in order to proceed with clustering and ANOVA. All the statistical tests use
Discussion
We investigated two research hypotheses in which we argue that a difference in terms of external quality (HQLTY) and productivity (HPROD) exists among three TDD skill set groups. Our TDD skill set includes two different kinds of skills: a priori knowledge of the concepts necessary to apply TDD (i.e., Java programming language and UT), and an in-process skill, i.e., the level of conformance to the TDD process. We first clustered the subjects according to their skill set, then we applied statistical
Threats to validity
In this section, we explain the main threats to the validity of our study following Wohlin et al. [41], along with the countermeasures we took when possible. Moreover, we suggest some actions that researchers willing to replicate this study could take to limit some of the threats. The types of validity threats are prioritized, in increasing order, following Cook and Campbell’s [42] guidelines. In particular, since this study is part of an effort to apply research in industry, we give more
Conclusions
In this work, we studied 30 professional software developers applying TDD to add new features to a legacy system close to real-world complexity. We contributed to the existing knowledge by operationalizing developers’ test-driven development skills, not only according to their a priori abilities (i.e., Java programming and UT), but also including their capacity to follow the test-driven development cycle. We clustered the subjects according to this skill set and compared them in terms of
Acknowledgements
This research is partially supported by the Academy of Finland with decision no. 278354, and by the Finnish Distinguished Professor (Fi.Di.Pro.) programme, ESEIL. The first author would like to acknowledge the Nokia Foundation and the ISACA Finland chapter for the support provided in completing this work. We would like to acknowledge Dr. Lucas Layman, who significantly contributed to the design of the task used in the study. We would like to acknowledge the anonymous reviewers for their helpful
References (50)
- et al. Considering rigor and relevance when evaluating test driven development: A systematic review. Inform. Softw. Technol. (2014)
- et al. Besouro: A framework for exploring compliance rules in automatic TDD behavior assessment. Inform. Softw. Technol. (2015)
- Test-driven Development: By Example (2002)
- Test Driven Development: A Practical Guide (2003)
- Aim, fire. IEEE Softw. (2001)
- et al. The effects of test-driven development on external quality and productivity: A meta-analysis. IEEE Trans. Softw. Eng. (2013)
- et al. How effective is test driven development?
- et al. Factors limiting industrial adoption of test driven development: A systematic review. 2011 IEEE Fourth International Conference on Software Testing, Verification and Validation (ICST) (2011)
- et al. On the effects of programming and testing skills on external quality and productivity in a test-driven development context. Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering (EASE’15) (2015)
- et al. On the role of tests in test-driven development: A differentiated and partial replication. Empir. Softw. Eng. (2013)
- Impact of process conformance on the effects of test-driven development. 8th ACM/IEEE International Symposium (ESEM’14)
- Effects of developer experience on learning and applying unit test-driven development. IEEE Trans. Softw. Eng.
- The effect of experience on the test-driven development process. Empir. Softw. Eng.
- Topic selection in industry experiments. Proceedings of the 2nd International Workshop on Conducting Empirical Studies in Industry (CESI’14)
- Are students representatives of professionals in software engineering experiments? Proceedings of the 37th International Conference on Software Engineering (ICSE’15)
- Empirical analysis of programming language adoption. SIGPLAN Not.
- Conformance factor in test-driven development: Initial results from an enhanced replication. Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering (EASE’14)
- Experimental and Quasi-Experimental Designs for Generalized Causal Inference
- Multisite randomized controlled trials in health services research: Scientific challenges and operational issues. Med. Care
- Statistical power and optimal design for multisite randomized trials. Psychol. Methods
- Learning test-driven development by counting lines. IEEE Softw.
- Coding Dojo: An environment for learning and sharing Agile practices. Agile Conference (AGILE’08)
- Operational definition and automated inference of test-driven development with Zorro. Autom. Softw. Eng.
- The role of process measurement in test-driven development
- Tool supported detection and judgment of nonconformance in process execution. 3rd International Symposium on Empirical Software Engineering and Measurement (ESEM’09)