Abstract
The evidence-based software engineering approach advocates using evidence from empirical studies to support practitioners' decisions on adopting software technologies in industry. To this end, many guidelines have been proposed to support the execution and repeatability of literature reviews, and to strengthen confidence in their results, especially regarding systematic literature reviews (SLRs).

Our objective is to investigate similarities and differences, and to characterize the challenges and pitfalls, in the planning and results of SLR research protocols addressing the same research question and performed by similar teams of novice researchers in the software engineering field.

We qualitatively compared (using the Jaccard and Kappa coefficients) and evaluated (using DARE) same-goal SLR research protocols and outcomes undertaken by similar research teams. Seven similar SLR protocols on quality attributes for use cases, executed in 2010 and 2012, allowed us to observe unexpected differences in their planning and execution.

Even when the participants reached some agreement in the planning, the outcomes differed. The research protocols and reports revealed six challenges contributing to the divergent results: researchers' inexperience with the topic, researchers' inexperience with the method, lack of clarity and completeness in the papers, lack of a common terminology for the problem domain, lack of research verification procedures, and lack of commitment to the SLR.

According to our findings, the results of SLRs performed by novices cannot be relied upon. Moreover, similarities at an initial or intermediate step of different SLR executions may not carry over to subsequent steps, since non-explicit information can produce divergent outcomes, hampering the repeatability of the SLR process and the confidence in its results.
Although we expect that the presence and follow-up of a senior researcher can help increase SLRs' repeatability, this conclusion can only be drawn from additional studies on the topic. Still, systematic planning, transparency of decisions, and verification procedures are key factors in guaranteeing the reliability of SLRs.
References
Babar MA, Zhang H (2009) Systematic literature reviews in software engineering: preliminary results from interviews with researchers. Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. Lake Buena Vista: IEEE
Basili VR (1992) Software modeling and measurement: the goal/question/metric paradigm. Technical Report. University of Maryland at College Park: College Park, MD, p 24
Biolchini J et al (2005) Systematic review in software engineering. Federal University of Rio de Janeiro. Rio de Janeiro, p 31. (RT-ES 679/05). Available at: http://www.cos.ufrj.br/uploadfile/es67905.pdf. Accessed 17 Aug 2017
Brereton P (2011) A study of computing undergraduates undertaking a systematic literature review. IEEE Trans Educ 54(4):558–563
Carver JC et al (2013) Identifying barriers to the systematic literature review process. Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. Baltimore: IEEE, p 203–213
Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46
Condori-Fernandez N et al (2009) A systematic mapping study on empirical evaluation of software requirements specifications techniques. Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. Lake Buena Vista: IEEE, p 502–505
Corbin J, Strauss A (2007) Basics of qualitative research: techniques and procedures for developing grounded theory, 3rd edn. Thousand Oaks: SAGE Publications. ISBN 978-1412906449
Dias Neto AC et al (2007) Characterization of model-based software testing approaches. PESC/COPPE/UFRJ. Rio de Janeiro. (ES-713/07). Available at: http://www.cos.ufrj.br/uploadfile/1188491168.pdf. Accessed 17 Aug 2017
Dieste O, Grimán A, Juristo N (2009) Developing search strategies for detecting relevant experiments. Empir Softw Eng 14(5):513–539
Dybå T, Kitchenham B, Jørgensen M (2005) Evidence-based software engineering for practitioners. IEEE Softw 22(1):58–65
Fantechi A et al (2002) Application of linguistic techniques for use case analysis. Proceedings of the IEEE Joint International Conference on Requirements Engineering. Essen: IEEE, p 157–164
Garousi V, Eskandar MM, Herkiloglu K (2016) Industry–academia collaborations in software testing: experience and success stories from Canada and Turkey. Softw Qual J:1–53. https://doi.org/10.1007/s11219-016-9319-5
Hassler E et al (2014) Outcomes of a community workshop to identify and rank barriers to the systematic literature review process. Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering. London: ACM. No. 31
Jaccard P (1912) The distribution of the flora in the alpine zone. New Phytol 11(2):37–50
Kasoju A, Petersen K, Mäntylä MV (2013) Analyzing an automotive testing process with evidence-based software engineering. Inf Softw Technol 55(7):1237–1259. https://doi.org/10.1016/j.infsof.2013.01.005
Kitchenham B, Charters S (2007) Guidelines for performing systematic literature reviews in software engineering. Keele University and University of Durham. Keele/Durham, p 65. (EBSE-2007-01)
Kitchenham B et al (2011) Repeatability of systematic literature reviews. Proceedings of the 15th International Conference on Evaluation and Assessment in Software Engineering. Durham: IEEE, p 46–55
Kitchenham B, Brereton P, Budgen D (2012) Mapping study completeness and reliability - a case study. Proceedings of the 16th International Conference on Evaluation and Assessment in Software Engineering. Ciudad Real: IET, p 126–135
Kuhrmann M, Fernández DM, Daneva M (2017) On the pragmatic design of literature studies in software engineering: an experience-based guideline. Empir Softw Eng, pp 2852–2891. https://doi.org/10.1007/s10664-016-9492-y
Lavallée M, Robillard P-N, Mirsalari R (2014) Performing systematic literature reviews with novices: an iterative approach. IEEE Trans Educ 57(3):175–181
López L, Costal D, Ayala CP, Franch X, Annosi MC, Glott R, Haaland K (2015) Adoption of OSS components: A goal-oriented approach. Data Knowl Eng 99:17–38. https://doi.org/10.1016/j.datak.2015.06.007
Losavio F et al (2004) Designing quality architecture: incorporating ISO standards into the unified process. Inf Syst Manag 21(1):27–44
MacDonell S et al (2010) How reliable are systematic reviews in empirical software engineering? IEEE Trans Softw Eng 36(5):676–687
Munir H, Moayyed M, Petersen K (2014) Considering rigor and relevance when evaluating test driven development: a systematic review. Inf Softw Technol 56(4):375–394
NHS Centre for Reviews and Dissemination, University of York (2002) The Database of Abstracts of Reviews of Effects (DARE). Effect Mat 6(2):1–4
Oates BJ, Capper G (2009) Using systematic reviews and evidence-based software engineering with masters students. Proceedings of the 13th International Conference on Evaluation and Assessment in Software Engineering. Durham: British Computer Society, p 79–87
Pai M et al (2004) Systematic reviews and meta-analyses: an illustrated, step-by-step guide. Natl Med J India 17(2):86–95
Petersen K, Ali NB (2011) Identifying strategies for study selection in systematic reviews and maps. Proceedings of the 5th International Symposium on Empirical Software Engineering and Measurement. Banff: IEEE, p 351–354
Petersen K et al (2008) Systematic mapping studies in software engineering. Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering. Bari: British Computer Society
Petersen K, Vakkalanka S, Kuzniarz L (2015) Guidelines for conducting systematic mapping studies in software engineering: an update. Inf Softw Technol 64(1):1–18
Phalp KT, Vincent J, Cox K (2007) Assessing the quality of use case descriptions. Softw Qual J 15(1):69–97
Preiss O, Wegmann A, Wong J (2001) On quality attribute based software engineering. Proceedings of the 27th Euromicro Conference. Warsaw: IEEE, p 114–120
Rago A, Marcos C, Diaz-Pace JA (2013) Uncovering quality-attribute concerns in use case specifications via early aspect mining. Requir Eng 18(1):67–84
Rainer A, Hall T, Baddoo N (2006) A preliminary empirical investigation of the use of evidence based software engineering by undergraduate students. Proceedings of the 10th International Conference on Evaluation and Assessment in Software Engineering. Keele: British Computer Society, p 91–100
Ramos R et al (2009) Quality improvement for use case model. Proceedings of the 23rd Brazilian Symposium on Software Engineering. Fortaleza: IEEE, p 187–195
Riaz M et al (2010) Experiences conducting systematic reviews from novices' perspective. Proceedings of the 14th International Conference on Evaluation and Assessment in Software Engineering. Swindon: British Computer Society, p 44–53
Shull F, Rus I, Basili V (2000) How perspective-based reading can improve requirements inspections. Computer 33(7):73–79
Travassos GH et al (2008) An environment to support large scale experimentation in software engineering. Proceedings of the 13th IEEE International Conference on Engineering of Complex Computer Systems. Belfast: IEEE, p 193–202
Ulziit B, Warraich ZA, Gencel C, Petersen K (2015) A conceptual framework of challenges and solutions for managing global software maintenance. Journal of Software: Evolution and Process 27(10):763–792. https://doi.org/10.1002/smr.1720
Viera AJ, Garrett JM (2005) Understanding interobserver agreement: the kappa statistic. Fam Med 37(5):360–363
Wohlin C (2014) Writing for synthesis of evidence in empirical software engineering. Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. Torino: ACM. No. 46
Wohlin C et al (2013) On the reliability of mapping studies in software engineering. J Syst Softw 86(10):2594–2610
Zhang H, Babar MA (2010) On searching relevant studies in software engineering. Proceedings of the 14th International Conference on Evaluation and Assessment in Software Engineering. Swindon: ACM, p 111–120
Zhang H, Babar MA, Tell P (2011) Identifying relevant studies in software engineering. Inf Softw Technol 53(6):625–637
Acknowledgments
We thank Daniela Cruzes, Marcela Genero, Martin Höst, Natalia Juristo, Nelly Condori-Fernandez, Oscar Dieste and Oscar Pastor for the initial discussions at ISERN 2009 that started this work; Vitor Faria Monteiro for his contribution to the original protocol planning; David Budgen for suggestions regarding an earlier version of this study report; all students for their engagement during the Experimental Software Engineering course in 2010 and 2012, and also the CNPq and CAPES for supporting this research. Prof. Travassos is a CNPq Researcher.
Additional information
Communicated by: Emerson Murphy-Hill
Appendices
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5
Appendix 6
Appendix 7
Appendix 8
Appendix 9
Appendix 10
Appendix 11
About this article
Cite this article
Ribeiro, T.V., Massollar, J. & Travassos, G.H. Challenges and pitfalls on surveying evidence in the software engineering technical literature: an exploratory study with novices. Empir Software Eng 23, 1594–1663 (2018). https://doi.org/10.1007/s10664-017-9556-7