ABSTRACT
During the last few years, Continuous Integration (CI) has become a common practice in open-source and industrial environments to reduce the scope for errors and increase the speed to market through automated build and test processes. However, despite this wide adoption, little is known about the challenges developers discuss. Analyzing developer discussions is required to understand what researchers, educators, and practitioners should focus on, and how discussion communities can help shed light on CI challenges. In this study, we examine Stack Overflow (SO), the most popular crowd-sourced forum, to understand the challenges developers face in the CI context. We collect a corpus of 27,728 CI-related developer posts from SO and analyze those posts through a mixed-method approach that combines quantitative and qualitative analyses. To study the trends of CI discussions, we investigate the metadata of CI questions, users, and tags. Then, we extract the main CI topics using Latent Dirichlet Allocation (LDA) tuned with a Genetic Algorithm (GA). Finally, we investigate the most popular and the most difficult topics faced by developers, based on unanswered questions, to gain further insights into CI challenges. The LDA clustering reveals that developers face challenges with six main topics, namely Build, Testing, Version Control, Configuration, Deployment, and CI Culture. In particular, we find that the Build topic is the most popular among the studied topics and that the Version Control and Testing topics are the most difficult for the SO community. Our study uncovers insights about CI challenges and adds evidence to existing knowledge about CI issues, especially those related to software build.
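The abstract says topic difficulty is assessed from unanswered questions but does not give the exact metric. As a minimal sketch only, assuming difficulty is the share of a topic's questions that received no answer, the ranking could be computed as below. The field names `topic` and `answer_count`, the function `topic_difficulty`, and the toy records are all hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def topic_difficulty(questions):
    """Return {topic: share of that topic's questions with zero answers}.

    Assumption: each question is a dict with a hypothetical 'topic' label
    (e.g. assigned by a topic model) and an 'answer_count' integer.
    """
    totals = defaultdict(int)
    unanswered = defaultdict(int)
    for q in questions:
        totals[q["topic"]] += 1
        if q["answer_count"] == 0:
            unanswered[q["topic"]] += 1
    return {topic: unanswered[topic] / totals[topic] for topic in totals}


# Hypothetical toy records, for illustration only (not the study's corpus).
sample = [
    {"topic": "Build", "answer_count": 2},
    {"topic": "Build", "answer_count": 0},
    {"topic": "Testing", "answer_count": 0},
    {"topic": "Testing", "answer_count": 0},
    {"topic": "Deployment", "answer_count": 1},
]

difficulty = topic_difficulty(sample)
# Topics ordered from highest to lowest unanswered share.
ranked = sorted(difficulty, key=difficulty.get, reverse=True)
```

Other proxies, such as the share of questions without an accepted answer or the median time to a first answer, would fit the same structure by changing the condition inside the loop.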
- 2023. Replication package. https://figshare.com/s/9682e9a121a2e51730ad