An Empirical Study on Continuous Integration Trends, Topics and Challenges in Stack Overflow

Authors:

Mohamed Wiem Mkaouer

EASE '23: Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering

Pages 141 - 151

https://doi.org/10.1145/3593434.3593485

Published: 14 June 2023

Abstract

Over the last few years, Continuous Integration (CI) has become a common practice in open-source and industrial environments, reducing the scope for errors and increasing speed to market through automated build and test processes. However, despite this wide adoption, little is known about the challenges developers discuss. Analyzing developer discussions is necessary to understand what researchers, educators and practitioners should focus on, and how discussion communities can help shed light on CI challenges. In this study, we examine Stack Overflow (SO), the most popular crowd-sourced forum, to understand the challenges developers face in the CI context. We collect a corpus of 27,728 CI-related developer posts from SO and analyze them using a mixed-method approach that combines quantitative and qualitative analyses. To study the trends of CI discussions, we investigate the metadata of CI questions, users and tags. Then, we extract the main CI topics using Latent Dirichlet Allocation (LDA) tuned with a Genetic Algorithm (GA). Finally, we investigate the most popular and the most difficult topics faced by developers, using unanswered questions to gain further insight into CI challenges. The LDA clustering reveals that developers face challenges in six main topics, namely Build, Testing, Version Control, Configuration, Deployment, and CI Culture. In particular, we find that Build is the most popular of the studied topics and that Version Control and Testing are the most difficult for the SO community. Our study uncovers insights about CI challenges and adds evidence to existing knowledge about CI issues, especially those related to software builds.
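The abstract states that topic difficulty is judged from unanswered questions. As a minimal sketch of that idea, and not the authors' actual pipeline, the snippet below computes two per-topic figures from labelled posts: the share of all posts that fall in a topic (used here as an assumed stand-in for popularity) and the share of a topic's questions that have no answers (a difficulty proxy). The `posts` records, their values, and the function name `topic_stats` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical (topic, answer_count) records. Illustrative values only,
# not data from the study.
posts = [
    ("Build", 2), ("Build", 0), ("Build", 1),
    ("Testing", 0), ("Testing", 0), ("Testing", 3),
    ("Version Control", 0), ("Version Control", 1),
]


def topic_stats(posts):
    """Return {topic: (share_of_all_posts, share_of_topic_unanswered)}."""
    totals = defaultdict(int)
    unanswered = defaultdict(int)
    for topic, answer_count in posts:
        totals[topic] += 1
        if answer_count == 0:  # a question with no answers at all
            unanswered[topic] += 1
    n = len(posts)
    return {t: (totals[t] / n, unanswered[t] / totals[t]) for t in totals}


stats = topic_stats(posts)
for topic, (share, unanswered_rate) in sorted(stats.items()):
    print(f"{topic}: {share:.1%} of posts, {unanswered_rate:.1%} unanswered")
```

Other popularity or difficulty signals that Stack Overflow exposes, such as view counts or the presence of an accepted answer, could be folded into the same per-topic aggregation; whether the study uses them is not stated in the abstract.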

Published In

EASE '23: Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering

June 2023

544 pages

ISBN:9798400700446

DOI:10.1145/3593434

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

EASE '23

EASE '23: The International Conference on Evaluation and Assessment in Software Engineering

June 14 - 16, 2023

Oulu, Finland

Acceptance Rates

Overall Acceptance Rate 71 of 232 submissions, 31%
