Empirical Research in Software Engineering — A Literature Survey

  • Survey
Journal of Computer Science and Technology

Abstract

Empirical research plays a significant role in software engineering (SE), where it is applied to evaluate software artifacts and technologies. A great number of empirical research articles have been published recently, and a large research community has formed around empirical software engineering (ESE). In this paper, we identify both the overall landscape and the detailed implementations of ESE, and investigate the frequently applied empirical methods, the targeted research purposes, the data sources used, and the data processing approaches and tools applied in ESE. The aim is to identify new trends and obtain interesting observations about empirical software engineering across different sub-fields of software engineering. We conduct a mapping study of 538 selected articles published from January 2013 to November 2017, guided by four research questions. We observe that the application of empirical methods in software engineering is continuously increasing, and that the most commonly applied methods are experiment, case study, and survey. Moreover, open source projects are the most frequently used data sources. We also observe that most researchers have paid attention to the validity of their studies and the possibility of replicating them. These observations are carefully analyzed and presented in diagrams. We also reveal shortcomings and needed knowledge and strategies in ESE, and propose recommendations for researchers.

Author information

Corresponding author

Correspondence to Jing Jiang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1 (PDF, 106 KB)

About this article

Cite this article

Zhang, L., Tian, JH., Jiang, J. et al. Empirical Research in Software Engineering — A Literature Survey. J. Comput. Sci. Technol. 33, 876–899 (2018). https://doi.org/10.1007/s11390-018-1864-x

  • DOI: https://doi.org/10.1007/s11390-018-1864-x
