Abstract
Feature annotations (e.g., code fragments guarded by #ifdef C-preprocessor directives) control code extensions related to features. Feature annotations have long been said to be undesirable. When maintaining features that control many annotations, there is a high risk of ripple effects. Also, excessive use of feature annotations leads to code clutter, hinders program comprehension, and makes maintenance harder. To prevent such problems, developers should monitor the use of feature annotations, for example, by setting acceptable thresholds. Interestingly, little is known about how to extract thresholds in practice, and which values are representative for feature-related metrics. To address this issue, we analyze the statistical distribution of three feature-related metrics collected from a corpus of 20 well-known and long-lived C-preprocessor-based systems from different domains. We consider three metrics: scattering degree of feature constants, tangling degree of feature expressions, and nesting depth of preprocessor annotations. Our findings show that feature scattering is highly skewed; in 14 systems (70%), the scattering distributions match a power law, making averages and standard deviations unreliable as limits. Regarding tangling and nesting, the values tend to follow a uniform distribution; although outliers exist, they have little impact on the mean, suggesting that central statistical measures are reliable thresholds for tangling and nesting. Following our findings, we then propose thresholds from our benchmark data, as a basis for further investigations.
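To make the three metrics concrete, the following is a minimal sketch, not the authors' tooling. It assumes simplified definitions: scattering degree is the number of feature expressions that mention a feature constant, tangling degree is the number of distinct feature constants in one feature expression, and nesting depth is the depth of nested conditional directives. The C fragment in SOURCE, the feature names, and the function feature_metrics are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Hypothetical C fragment with feature annotations (illustrative only).
SOURCE = """\
#ifdef FEAT_A
int a;
#if defined(FEAT_B) && defined(FEAT_C)
int bc;
#endif
#endif
#ifdef FEAT_B
int b;
#endif
"""

# Conditional-compilation directives and C identifiers.
DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|elif|endif)\b(.*)$")
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*")


def feature_metrics(source):
    """Compute simplified scattering, tangling, and nesting values.

    Assumed definitions (a simplification, not taken from the paper):
      scattering[c] = number of feature expressions mentioning constant c
      tangling[i]   = number of distinct constants in the i-th expression
      max_depth     = deepest nesting of conditional directives
    """
    scattering = Counter()
    tangling = []
    depth = 0
    max_depth = 0
    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if match is None:
            continue  # ordinary code line, not a conditional directive
        keyword, expression = match.groups()
        if keyword == "endif":
            depth -= 1
            continue
        if keyword != "elif":
            # #if, #ifdef, and #ifndef open a new nesting level;
            # #elif stays on the current level.
            depth += 1
            max_depth = max(max_depth, depth)
        # Every identifier except the `defined` operator counts as a
        # feature constant in this sketch.
        constants = set(IDENTIFIER.findall(expression)) - {"defined"}
        tangling.append(len(constants))
        scattering.update(constants)
    return scattering, tangling, max_depth


scattering, tangling, max_depth = feature_metrics(SOURCE)
```

For this fragment, FEAT_B appears in two feature expressions while FEAT_A and FEAT_C each appear in one, the three expressions have tangling degrees 1, 2, and 1, and the maximum nesting depth is 2. A real analysis would also have to handle comments, #else branches, multilines, and macros that are not feature constants, which this sketch ignores.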
Notes
Data available at http://tuvalu.santafe.edu/~aaronc/powerlaws/data.htm.
Multilines are convenient for splitting a long line across several physical lines; during compilation, a sequence of multilines is treated as a single line.
xterm change log is available at http://invisible-island.net/xterm/xterm.log.html.
Acknowledgments
We thank CNPq, CAPES, FAPEMIG, and the German Research Foundation (AP 206/4, AP 206/5, AP 206/6) for partially funding this project.
Additional information
Communicated by Prof. Andrzej Wąsowski and Thorsten Weyer.
About this article
Cite this article
Queiroz, R., Passos, L., Valente, M.T. et al. The shape of feature code: an analysis of twenty C-preprocessor-based systems. Softw Syst Model 16, 77–96 (2017). https://doi.org/10.1007/s10270-015-0483-z
DOI: https://doi.org/10.1007/s10270-015-0483-z