Empirical research on concurrent software testing: A systematic mapping study

https://doi.org/10.1016/j.infsof.2018.08.017

Abstract

Background: Concurrent software testing is a costly and difficult task, especially due to the exponential increase in the test sequences caused by non-determinism. Such an issue has motivated researchers to develop testing techniques that select a subset of the input domain that has a high probability of revealing faults. Academics and industrial practitioners rarely use most concurrent software testing techniques because of the lack of data about their applicability. Empirical evidence can provide an important scientific basis for the strengths and weaknesses of each technique to help researchers and practitioners choose concurrent testing techniques appropriate for their environments.

Aim: This paper gathers and synthesizes empirical research on concurrent software testing to characterize the field and the types of empirical studies performed.

Method: We performed a systematic mapping study to identify and analyze empirical research on concurrent software testing techniques. We provide a detailed analysis of the studies and their design choices.

Results: The primary findings are: (1) there is a general lack of empirical validation of concurrent software testing techniques, (2) the type of evaluation method varies with the type of technique, (3) there are some key challenges to empirical study design in concurrent software testing, and (4) there is a dearth of controlled experiments in concurrent software testing.

Conclusions: There is little empirical evidence available about some specific concurrent testing techniques like model-based testing and formal testing. Overall, researchers need to perform more empirical work, especially real-world case studies and controlled experiments, to validate properties of concurrent software testing techniques. In addition, researchers need to perform more analyses and synthesis of the existing evidence. This paper is a first step in that direction.

Introduction

The availability of multicore processors and inexpensive clusters has increased the demand both for concurrent applications and for testing techniques to validate them. Modern business applications use concurrency to improve overall system performance; consequently, researchers have developed a variety of testing techniques for concurrent software. However, testing teams generally rely on their own knowledge and experience when choosing technique(s) for each project, resulting in the repeated selection of the same technique(s), whether or not they are the most appropriate.
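To make concrete why non-determinism causes the exponential growth in test sequences mentioned in the Abstract: with t threads each executing s atomic steps, the number of distinct interleavings that preserve each thread's program order is (t·s)!/(s!)^t. The following Python sketch (our illustration, not part of the mapping study; the function name is hypothetical) shows how quickly this count explodes:

```python
from math import factorial

def interleavings(threads: int, steps: int) -> int:
    """Number of distinct interleavings of `threads` threads, each
    executing `steps` atomic operations, with program order preserved
    inside every thread: (t*s)! / (s!)^t (a multinomial coefficient)."""
    return factorial(threads * steps) // factorial(steps) ** threads

# Even tiny programs have far too many schedules to test exhaustively:
for t in (2, 3, 4):
    print(t, "threads of 4 steps:", interleavings(t, 4))
# 2 threads -> 70; 3 threads -> 34,650; 4 threads -> 63,063,000
```

This combinatorial blow-up is why the techniques surveyed here select only a subset of the input domain (and of the schedule space) with a high probability of revealing faults.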

One of the difficulties in transferring knowledge and research results from academia to industry is the lack of evidence about the applicability of results and techniques to specific software projects [1]. Such evidence should be gathered via empirical software engineering (ESE) methods that provide insights into the benefits and limits of each technique [2]. Secondary studies on concurrent software testing focus on the categorization of testing techniques, methodologies, and tools [3], [4], [5], [6] (see Section 2 for more details). However, empirical validation of concurrent software testing techniques is still lacking [1], [2].

Researchers use controlled experiments, case studies, and surveys as empirical methods to evaluate new techniques and new research. These methods produce data that help researchers and practitioners decide whether a technique is appropriate for a given context. They also help researchers identify specific factors that impact the effectiveness of techniques and lead to empirically based decisions about research and practice [7]. Therefore, they provide an important scientific basis for software engineering [8].

This paper presents a systematic mapping study that characterizes the current state of the art of empirical research on concurrent software testing. The overall goal is to gather and synthesize empirical research and help developers evaluate the strength of evidence for the findings. The systematic mapping aims at:

  1. providing an overview of the empirical studies about concurrent software testing;

  2. identifying the concurrent software testing techniques that have empirical studies and the type of validation approach used;

  3. analyzing the strength of the empirical studies and discussing the findings and limitations of the evidence;

  4. discussing the design of the empirical studies along with the challenges and research opportunities;

  5. identifying gaps in empirical research on concurrent software testing; and

  6. providing guidance for the design of empirical studies about concurrent software testing.

The remainder of the paper is organized as follows. Section 2 provides an overview of the related work. Section 3 describes the systematic mapping protocol. Section 4 discusses the strengths of the empirical studies identified in the mapping study. Section 5 answers the research questions through the study results. Section 6 addresses the way researchers conduct empirical studies for concurrent software testing and provides a guide for the planning of new studies. Section 7 discusses the limitations and validity threats of the mapping study. Finally, Section 8 presents our conclusions and suggests future work.

Section snippets

Related work

This section addresses some fundamentals about concurrent software testing and provides an overview of prior work. It also provides a brief overview of the types of empirical studies relevant to this review.

Systematic mapping plan

The following subsections describe the steps of the mapping protocol based on the model by Petersen et al. [31].

Strength of empirical evidence

This section discusses the strengths of the empirical studies conducted to validate concurrent software testing techniques. We use the grouping in Table 6 [40], which differs based upon the information used for test data selection, to organize the discussion.

The following subsections provide an overview of each type of technique, the main conclusions drawn from the studies about the techniques, and the limitations of those studies. Appendix B provides the details from each included study.

Results and synthesis

We organize this section around the research questions from Table 2 using the evidence from Section 4.6.

Designing empirical studies in concurrent software testing

The guidance in this section comes from our literature review and from our own experiences [95]. Throughout this section, we use examples from the literature review to suggest experimental content including study goals, study designs, subject programs, variables, and metrics. Therefore, this section provides an outline for designing studies about concurrent software testing techniques.

The following subsections describe the steps in the experimental study design process. We believe by providing

Threats to validity

This section describes the threats to the validity of our research.

Construct Validity threats result from the specific set of papers included in the mapping study. Our search string, choice of databases, and paper selection process may have inadvertently omitted relevant papers. We mitigated this threat by using a systematic process and periodic checking of results by a second author.

Internal Validity threats relate to the accuracy of conclusions on cause and effect data extracted from each study. We used a

Conclusions

This paper describes a systematic mapping that identifies and classifies empirical studies on concurrent software testing techniques. None of the existing secondary studies explicitly focused on the empirical validation of the proposed techniques. Therefore, this paper fills a gap in the literature. Our systematic mapping study includes 109 studies that contain empirical validation of concurrent software testing techniques. Based on those studies, we can draw the following conclusions:

  • 1.

    The

Acknowledgment

The authors acknowledge FAPESP - São Paulo Research Foundation for financial support under processes no. 2015/23653-5 and 2013/05046-9.

References (150)

  • N. Juristo et al.

    Reviewing 25 years of testing technique experiments

    Empir. Softw. Eng.

    (2004)
  • V. Arora et al.

    A systematic review of approaches for testing concurrent programs

    Concurrency Comput.

    (2015)
  • S.R.S. Souza et al.

    Research in concurrent software testing: a systematic review

    Proceedings of the Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging

    (2011)
  • M.A.S. Brito et al.

    Concurrent software testing: a systematic review

    22nd IFIP International Conference on Testing Software and Systems

    (2010)
  • A.A. Mamun et al.

Concurrent software testing: a systematic review and an evaluation of static analysis tools

    (2009)
  • D.E. Perry et al.

    Empirical studies of software engineering: a roadmap

    Proceedings of the Conference on The Future of Software Engineering

    (2000)
  • C. Wohlin et al.

    Experimentation in Software Engineering: An Introduction

    (2000)
  • A. Grama et al.

    Introduction to Parallel Computing

    (2003)
  • Y. Lei et al.

    Reachability testing of concurrent programs

    IEEE Trans. Softw. Eng.

    (2006)
  • C.-S.D. Yang

    Program-based, Structural Testing of Shared Memory Parallel Programs

    (1999)
  • M.A.S. Brito et al.

An empirical evaluation of the cost and effectiveness of structural testing criteria for concurrent programs

    International Conference on Computational Science, ICCS

    (2013)
  • D. Kester et al.

    How good is static analysis at finding concurrency bugs?

    SCAM

    (2010)
  • M. Gligoric et al.

Selective mutation testing for concurrent code

  • E. Sherman et al.

Saturation-based testing of concurrent programs

  • J. Yu et al.

    Maple: a coverage-driven testing tool for multithreaded programs

    SIGPLAN Not.

    (2012)
  • K. Lu et al.

    Efficient deterministic multithreading without global barriers

    Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

    (2014)
  • Y. Lei et al.

    A combinatorial testing strategy for concurrent programs

    Softw. Testing Verif. Reliab.

    (2007)
  • K. Lu et al.

    An efficient and flexible deterministic framework for multithreaded programs

    J. Comput. Sci. Technol.

    (2015)
  • N. Juristo et al.

    Basics of Software Engineering Experimentation

    (2010)
  • L. Briand et al.

    Empirical studies of software testing techniques: challenges, practical strategies, and future research

    SIGSOFT Softw. Eng. Notes

    (2004)
  • M. Tyagi et al.

    A review of empirical evaluation of software testing techniques with subjects

    Int. J. Adv. Res. Comput. Eng. Technol.

    (2014)
  • R.L. Van Horn

    Empirical studies of management information systems

    ACM Spec. Interest Group Manag. Inf. Syst. SIGMIS

    (1973)
  • M.J. Cheon et al.

    The evolution of empirical research in IS: a study in IS maturity

    Inf. Manag.

    (1993)
  • R.L. Glass et al.

The evolution of empirical research in IS: a study in IS maturity

    Inf. Softw. Technol.

    (2002)
  • R.K. Yin

    Case Study Research: Design and Methods

    (2002)
  • C. Zannier et al.

    On the success of empirical studies in the international conference on software engineering

    Proceedings of the 28th International Conference on Software Engineering

    (2006)
  • K. Petersen et al.

Guidelines for conducting systematic mapping studies in software engineering: an update

    Inf. Softw. Technol.

    (2015)
  • J.S. Alowibdi et al.

    An empirical study of data race detector tools

    2013 25th Chinese Control and Decision Conference (CCDC)

    (2013)
  • M. Gligoric et al.

    Efficient mutation testing of multithreaded code

    Softw. Testing, Verif. Reliab.

    (2013)
  • S. Lu et al.

    Learning from mistakes: a comprehensive study on real world concurrency bug characteristics

    Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems

    (2008)
  • P.V. Koppol et al.

    Incremental integration testing of concurrent programs

    IEEE Trans. Softw. Eng.

    (2002)
  • B. Kitchenham et al.

    Guidelines for Performing Systematic Literature Reviews in Software Engineering

    Technical Report

    (2007)
  • J. Cohen

    A coefficient of agreement for nominal scales

    Educ. Psychol. Meas.

    (1960)
  • T. Dybå et al.

    Empirical studies of agile software development: a systematic review

    Inf. Softw. Technol.

    (2008)
  • P.G. Joisha et al.

    On a technique for transparently empowering classical compiler optimizations on multithreaded code

    ACM Trans. Program. Lang. Syst.

    (2012)
  • M. Ganai et al.

DTAM: dynamic taint analysis of multi-threaded programs for relevancy

    Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering

    (2012)
  • S. Tasiran et al.

    Location pairs: a test coverage metric for shared-memory concurrent programs

    Empir. Softw. Eng.

    (2012)
  • S. Hong et al.

    The impact of concurrent coverage metrics on testing effectiveness

    2013 IEEE Sixth International Conference on Software Testing, Verification and Validation

    (2013)
  • T. Sheng et al.

RaceZ: a lightweight and non-invasive race detection tool for production applications

    International Conference on Software Engineering ICSE

    (2011)
  • Y. Cai et al.

Effective and precise dynamic detection of hidden races for Java programs

    Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering

    (2015)