Abstract
Software developers rely on a fast build system to incrementally compile their source code changes and produce modified deliverables for testing and deployment. Header files, which tend to trigger slow rebuild processes, are most problematic if they also change frequently during the development process, and hence, need to be rebuilt often. In this paper, we propose an approach that analyzes the build dependency graph (i.e., the data structure used to determine the minimal list of commands that must be executed when a source code file is modified), and the change history of a software system to pinpoint header file hotspots—header files that change frequently and trigger long rebuild processes. Through a case study on the GLib, PostgreSQL, Qt, and Ruby systems, we show that our approach identifies header file hotspots that, if improved, will provide greater improvement to the total future build cost of a system than just focusing on the files that trigger the slowest rebuild processes, change the most frequently, or are used the most throughout the codebase. Furthermore, regression models built using architectural and code properties of source files can explain 32–57 % of these hotspots, identifying subsystems that are particularly hotspot-prone and would benefit the most from architectural refinement.
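As a rough illustration of the idea described above, the following minimal sketch combines a build dependency graph with change counts to rank header files. It is not the authors' implementation: the scoring rule (change count multiplied by the summed cost of all transitively triggered targets), the function names `rebuild_cost` and `rank_hotspots`, and the toy project data are all assumptions made for this example.

```python
from collections import deque


def rebuild_cost(dependents, cost, changed_file):
    """Sum the cost of every target transitively triggered by changed_file.

    dependents maps a file or target to the targets that depend on it;
    cost maps a target to the (assumed) time needed to rebuild it.
    """
    seen = set()
    queue = deque([changed_file])
    while queue:
        node = queue.popleft()
        for target in dependents.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sum(cost.get(target, 0.0) for target in seen)


def rank_hotspots(dependents, cost, change_counts):
    """Rank headers by an assumed score: change count times rebuild cost."""
    scores = {
        header: change_counts[header] * rebuild_cost(dependents, cost, header)
        for header in change_counts
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy project: two headers, three object files, one binary.
dependents = {
    "util.h": ["a.o", "b.o"],
    "rare.h": ["a.o", "b.o", "c.o"],
    "a.o": ["app"],
    "b.o": ["app"],
    "c.o": ["app"],
}
cost = {"a.o": 2.0, "b.o": 3.0, "c.o": 4.0, "app": 1.0}  # seconds, made up
change_counts = {"util.h": 10, "rare.h": 1}  # changes in the history, made up

ranking = rank_hotspots(dependents, cost, change_counts)
```

In this toy data, `rare.h` triggers the slowest rebuild, yet `util.h` ranks first because it changes ten times as often. This mirrors the contrast the abstract draws between hotspots and files that merely trigger the slowest rebuild processes.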







Notes
The simulation was repeated for 20 and 50 % improvements, which yielded similar results.
Cite this article
McIntosh, S., Adams, B., Nagappan, M. et al. Identifying and understanding header file hotspots in C/C++ build processes. Autom Softw Eng 23, 619–647 (2016). https://doi.org/10.1007/s10515-015-0183-5
DOI: https://doi.org/10.1007/s10515-015-0183-5