Towards an operationalization of test-driven development skills: An industrial empirical study

https://doi.org/10.1016/j.infsof.2015.08.004

Abstract

Context: The majority of the empirical studies on Test-driven development (TDD) are concerned with verifying or refuting the effectiveness of the technique over a traditional approach, and they tend to neglect whether the subjects possess the skills needed to apply TDD, even while arguing that such skills are necessary.

Objective: We evaluate a set of minimal, a priori and in-process skills necessary to apply TDD. We determine whether variations in external quality (i.e., number of defects) and productivity (i.e., number of features implemented) can be associated with different clusters of the TDD skill set.

Method: We executed a quasi-experiment involving 30 practitioners from industry. We first grouped the participants according to their TDD skill set (consisting of a priori experience in programming and testing, as well as in-process TDD conformance) into three levels (Low-Medium-High) using k-means clustering. We then applied ANOVA to compare the clusters in terms of external quality and productivity, and conducted post-hoc pairwise analysis.

Results: We did not observe a statistically significant difference between the clusters, either for external software quality (F(2,27)=1.44, p=.260) or for productivity (F(2,27)=3.02, p=.065). However, the analysis of the effect sizes and their confidence intervals shows that the TDD skill set is a factor that could account for up to 28% of the variance in external quality, and up to 38% in productivity.

Conclusion: We have reason to conclude that focusing on improving the TDD skill set investigated in this study could help software developers improve their baseline productivity and the external quality of the code they produce. However, replications are needed to overcome the issues related to the statistical power of this study. We suggest practical insights for future work to investigate the phenomenon further.

Introduction

Test-driven development (TDD) is a software development technique in which the development is guided by writing unit tests. It was popularized in the late 1990s as part of Extreme Programming [1]. A developer using TDD follows four steps:

  1. Write a unit test for the functionality she wants to add.

  2. Run the unit test to make sure it fails.

  3. Write only enough production code to make the test pass.

  4. Refactor both production and test code, and re-run the tests.
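The four steps above form a single red-green-refactor iteration. The following is an illustrative Python sketch using the standard-library unittest module (the study itself had subjects work in Java with JUnit); the `add` function is a hypothetical piece of functionality, not one from the paper:

```python
import unittest

# Steps 1-2: write a unit test for the functionality we want to add,
# then run it and watch it fail (at this point `add` would not exist yet).
class TestAdd(unittest.TestCase):
    def test_add_two_numbers(self):
        self.assertEqual(add(2, 3), 5)

# Step 3: write only enough production code to make the test pass.
def add(a, b):
    return a + b

if __name__ == "__main__":
    # Step 4: refactor both test and production code, then re-run;
    # a green run closes the cycle and the next test starts a new one.
    unittest.main()
```

The discipline lies in the ordering: the failing run in step 2 is what confirms the new test actually exercises behavior that does not exist yet.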

TDD is claimed to yield better results than traditional approaches to software development (e.g., when unit tests are written after the intended functionality is considered completed by the development team) in terms of developers’ productivity, external quality (e.g., reduced number of defects), maintainability, and extensibility [2], [3]. However, empirical investigations of the effects of TDD report conflicting results [4], [5], suggesting that the outcomes are influenced by several variables (e.g., academic vs. industrial settings), including the skills of the developers.

Literature reviews on TDD conclude that applying the technique, and subsequently obtaining its postulated benefits, requires certain skills [5], [6]; however, these studies do not indicate what those skills are. We began our investigation of skills with students in a previous study [7]. In that context, we looked at their pre-existing knowledge regarding two practical skills: proficiency with the programming language and with unit testing (UT). When the subjects tackled a small programming task using TDD, we found that such skills had little impact on their productivity, defined as the output (e.g., parts of the task completed) per unit of effort (e.g., time to complete the task). No significant relationship was observed regarding the quality of the software they produced (e.g., the defects found in the parts of the task that the subjects completed). In the same study, we acknowledged that other skills must be present in order for developers to achieve the benefits advocated by TDD supporters.

With these motivations, grounded in the existing literature and our previous work, we incorporate in this study another practical skill, which we call TDD process conformance, alongside programming-language and unit-testing skills. TDD process conformance represents the ability of a developer to follow the TDD cycle. Together, these three skills constitute our TDD skill set. Further, we used a more realistic task to overcome the limitations of small programming tasks, and recruited professional developers for the study. Consequently, the research goal of this work is the following:

In our previous studies [7], [8], [9] we have investigated the role that each skill plays individually with student subjects working on toy tasks. We now focus on the impact the skills have, when taken together, on the outcomes of interest, by performing a quasi-experiment involving 43 professional software developers (30 after mortality) without prior working experience in TDD. The developers were trained during a week-long workshop and then asked to implement new features of a legacy system using TDD. Finally, we evaluated the composite effect of their skills on their performance in terms of external quality and productivity. Hence, we contribute to the existing knowledge by:

  • Empirically investigating, with professional developers, the anecdotal claim that TDD requires skills for its benefits to manifest.

  • Building a model for quality and productivity that takes into account a set of practical skills (Section 3).

  • Providing initial empirical evidence that further investigation of the proposed TDD skill set is worth pursuing (Section 5).

The strong points of our study lie in the settings (Section 4) in which it was conducted. In particular, we:

  • Analyze data collected from professional software developers.

  • Utilize a near real-world, brown-field task rather than a toy, green-field task (see Section 4.2 and Appendix B).

  • Quantify process conformance analytically, rather than relying on self-reports.
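As a rough illustration of what analytical process-conformance measurement involves, the sketch below classifies recorded development events with a simplified, hypothetical heuristic; the event names and rules are assumptions for illustration, whereas the tools cited in this work (e.g., Besouro, Zorro) infer conformance from IDE events using far richer rule sets:

```python
def is_tdd_conformant(cycle):
    """Classify one development cycle (an ordered list of events such as
    'test-edit', 'prod-edit', 'test-run-fail', 'test-run-pass') as
    TDD-conformant: the test is edited before production code, the new
    test is seen to fail, and the cycle ends with a green test run."""
    try:
        first_test_edit = cycle.index('test-edit')
        first_prod_edit = cycle.index('prod-edit')
    except ValueError:
        return False  # a TDD cycle needs both a test edit and a prod edit
    return (first_test_edit < first_prod_edit                              # test-first
            and 'test-run-fail' in cycle[first_test_edit:first_prod_edit]  # saw it fail
            and cycle[-1] == 'test-run-pass')                              # ended green

def conformance(cycles):
    """Fraction of recorded cycles classified as conformant."""
    return sum(map(is_tdd_conformant, cycles)) / len(cycles)

cycles = [
    ['test-edit', 'test-run-fail', 'prod-edit', 'test-run-pass'],  # conformant
    ['prod-edit', 'test-edit', 'test-run-pass'],                   # test-last
]
print(conformance(cycles))  # 0.5
```

Measuring conformance from logged events, rather than asking developers whether they followed the cycle, removes the self-report bias the bullet above refers to.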

The rest of the paper is organized as follows. Section 2 presents the existing literature related to our research; Section 3 defines the TDD skill set used in our study; Section 4 explains the details of our empirical study design; Sections 5 and 6 report the results and associated discussion. We address the threats to the validity of our study in Section 7 and conclude the paper in Section 8.

Section snippets

Related work

Test-driven development has been the subject of several secondary studies. The systematic literature review by Turhan et al. [5], covering 32 empirical studies, found positive effects on external quality, whereas the productivity results were inconclusive, when TDD was used across different settings. The meta-analysis by Rafique and Misic [4] is of interest when looking at how experience works with the postulated TDD effects. The work covers 10 years of TDD publications, from 2000 to 2011, in 25…

A skill set for TDD

Our goal in this paper is to make a holistic analysis of the skills rather than focusing on them individually. Therefore, we include three skills, i.e., programming and testing skills as well as TDD process conformance, to define a TDD skill set.

Although existing literature acknowledges that skills matter when applying TDD, none indicates the necessary ones. For example, Causevic et al. [6] identified the lack of developers’ skills as one of the main impediments to the adoption of TDD by…

Study definition

An overview of the study is presented in Fig. 1. The study seeks the answers to the research questions presented in Section 4.1. We recruited subjects from two companies, in the context of a workshop about UT and TDD (Section 4.2). We assessed the subjects’ skills in Java development and UT at the beginning of the workshop. During the workshop the subjects carried out a brown-field, real-world task (Section 4.3). Subsequently, we collected the necessary data to extract TDD process…

Results

In this section, we first report the descriptive statistics of the data, and provide a sanity check in order to proceed with clustering and ANOVA. All the statistical tests use α=0.05.
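For readers unfamiliar with the analysis pipeline, the following minimal Python sketch shows how a one-way ANOVA F statistic and an eta-squared effect size (the share of variance attributable to cluster membership, the kind of figure behind the 28% and 38% estimates) can be computed; the three groups and their scores are invented for illustration and are not the study’s data:

```python
# Hypothetical quality scores for three skill clusters (illustrative only).
groups = {
    'Low':    [55.0, 60.0, 58.0, 52.0],
    'Medium': [62.0, 66.0, 64.0, 61.0],
    'High':   [70.0, 68.0, 72.0, 74.0],
}

def one_way_anova(groups):
    """Return (F, eta_squared) for a one-way ANOVA.
    eta^2 = SS_between / SS_total is the fraction of total variance
    explained by group membership."""
    all_values = [v for g in groups.values() for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group sizes times squared mean offsets.
    ss_between = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2
                     for g in groups.values())
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, ss_between / (ss_between + ss_within)

f_stat, eta_sq = one_way_anova(groups)
print(round(f_stat, 2), round(eta_sq, 2))
```

A p-value for F would then be read off the F(df_between, df_within) distribution and compared against the study’s α=0.05; the effect size, unlike the p-value, is informative even when significance is not reached.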

Discussion

We investigated two research hypotheses in which we argue that a difference in terms of external quality (HQLTY) and productivity (HPROD) exists among three TDD skill set groups. Our TDD skill set includes two different kinds of skills: a priori knowledge of concepts necessary to apply TDD (i.e., the Java programming language and UT), and an in-process skill, i.e., the level of conformance to the TDD process. We first clustered the subjects according to their skill set, then we applied statistical…

Threats to validity

In this section, we explain the main threats to the validity of our study following Wohlin et al. [41], along with the countermeasures we took when possible. Moreover, we suggest some actions that researchers willing to replicate this study could take to limit some of the threats. The types of validity threats are prioritized, in increasing order, following Cook and Campbell’s [42] guidelines. In particular, since this study is part of an effort to apply research in industry, we give more…

Conclusions

In this work, we studied 30 professional software developers applying TDD to add new features to a legacy system close to real-world complexity. We contributed to the existing knowledge by operationalizing developers’ test-driven development skills, not only according to their a priori abilities (i.e., Java programming and UT), but also including their capacity to follow the test-driven development cycle. We clustered the subjects according to this skill set and compared them in terms of…

Acknowledgements

This research is partially supported by the Academy of Finland with decision no.: 278354, and by Finnish Distinguished Professor (Fi.Di.Pro.) programme, ESEIL. The first author would like to acknowledge the Nokia Foundation and ISACA Finland chapter for the support provided in completing this work. We would like to acknowledge Dr. Lucas Layman who significantly contributed to the design of the task used in the study. We would like to acknowledge the anonymous reviewers for their helpful…

References (50)

  • H. Munir et al.

    Considering rigor and relevance when evaluating test driven development: A systematic review

    Inform. Softw. Technol.

    (2014)
  • K. Becker et al.

    Besouro: A framework for exploring compliance rules in automatic TDD behavior assessment

    Inform. Softw. Technol.

    (2015)
  • K. Beck

    Test-driven Development: By Example

    (2002)
  • D. Astels

    Test Driven Development: A Practical Guide

    (2003)
  • K. Beck

    Aim, fire

    IEEE Softw.

    (2001)
  • Y. Rafique et al.

    The effects of test-driven development on external quality and productivity: A meta-analysis

    IEEE Trans. Softw. Eng.

    (2013)
  • B. Turhan et al.

    How effective is test driven development?

  • A. Causevic et al.

    Factors limiting industrial adoption of test driven development: A systematic review

    2011 IEEE Fourth International Conference on Software Testing, Verification and Validation (ICST)

    (2011)
  • D. Fucci et al.

    On the effects of programming and testing skills on external quality and productivity in a test-driven development context

    Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering (EASE’15)

    (2015)
  • D. Fucci et al.

    On the role of tests in test-driven development: A differentiated and partial replication

    Empir. Softw. Eng.

    (2013)
  • D. Fucci et al.

    Impact of process conformance on the effects of test-driven development

    8th ACM/IEEE International Symposium (ESEM’14)

    (2014)
  • R. Latorre

    Effects of developer experience on learning and applying unit test-driven development

    IEEE Trans. Softw. Eng.

    (2014)
  • M.M. Müller et al.

    The effect of experience on the test-driven development process

    Empir. Softw. Eng.

    (2007)
  • A.T. Misirli et al.

    Topic selection in industry experiments

    Proceedings of the 2nd International Workshop on Conducting Empirical Studies in Industry (CESI’14)

    (2014)
  • I. Salman et al.

    Are students representatives of professionals in software engineering experiments?

    Proceedings of the 37th International Conference on Software Engineering (ICSE’15)

    (2015)
  • L.A. Meyerovich et al.

    Empirical analysis of programming language adoption

    SIGPLAN Not.

    (2013)
  • D. Fucci et al.

    Conformance factor in test-driven development: Initial results from an enhanced replication

    Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering (EASE’14)

    (2014)
  • W.R. Shadish et al.

    Experimental and Quasi-Experimental Designs for Generalized Causal Inference

    (2003)
  • M. Weinberger et al.

    Multisite randomized controlled trials in health services research: Scientific challenges and operational issues

    Med. Care

    (2001)
  • S.W. Raudenbush et al.

    Statistical power and optimal design for multisite randomized trials.

    Psychol. Methods

    (2000)
  • B. Vodde et al.

    Learning test-driven development by counting lines

    IEEE Softw.

    (2007)
  • D.T. Sato et al.

    Coding Dojo: An environment for learning and sharing Agile practices

    Agile Conference (AGILE’08)

    (2008)
  • H. Kou et al.

    Operational definition and automated inference of test-driven development with Zorro

    Autom. Softw. Eng.

    (2010)
  • Y. Wang et al.

    The role of process measurement in test-driven development

  • N. Zazworka et al.

    Tool supported detection and judgment of nonconformance in process execution

    3rd International Symposium on Empirical Software Engineering and Measurement (ESEM’09)

    (2009)