The shape of feature code: an analysis of twenty C-preprocessor-based systems

Queiroz, Rodrigo; Passos, Leonardo; Valente, Marco Tulio; Hunsen, Claus; Apel, Sven; Czarnecki, Krzysztof

doi:10.1007/s10270-015-0483-z

Theme Section Paper
Published: 16 July 2015

Volume 16, pages 77–96 (2017)

Software & Systems Modeling

Rodrigo Queiroz¹, Leonardo Passos², Marco Tulio Valente¹, Claus Hunsen³, Sven Apel³ & Krzysztof Czarnecki²

Abstract

Feature annotations (e.g., code fragments guarded by #ifdef C-preprocessor directives) control code extensions related to features. Feature annotations have long been said to be undesirable. When maintaining features that control many annotations, there is a high risk of ripple effects. Also, excessive use of feature annotations leads to code clutter, hinders program comprehension, and makes maintenance harder. To prevent such problems, developers should monitor the use of feature annotations, for example, by setting acceptable thresholds. Interestingly, little is known about how to extract thresholds in practice and which values are representative for feature-related metrics. To address this issue, we analyze the statistical distribution of three feature-related metrics collected from a corpus of 20 well-known and long-lived C-preprocessor-based systems from different domains. We consider three metrics: scattering degree of feature constants, tangling degree of feature expressions, and nesting depth of preprocessor annotations. Our findings show that feature scattering is highly skewed; in 14 systems (70%), the scattering distributions match a power law, making averages and standard deviations unreliable as limits. Regarding tangling and nesting, the values tend to follow a uniform distribution; although outliers exist, they have little impact on the mean, suggesting that central statistical measures are reliable thresholds for tangling and nesting. Based on these findings, we propose thresholds derived from our benchmark data as a basis for further investigation.
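
The three metrics named in the abstract can be illustrated with a minimal sketch. It assumes simplified definitions that this page does not spell out: the scattering degree of a feature constant is taken as the number of conditional directives that mention it, the tangling degree of a feature expression as the number of distinct constants in that expression, and the nesting depth as the deepest level of nested conditional blocks. The function name feature_metrics and the sample input are hypothetical, and the code is not the authors' tooling; it ignores comments inside directives and does not separate feature constants from other macros.

```python
import re
from collections import Counter

# Conditional directives that carry a feature expression.
_COND = re.compile(r'^\s*#\s*(ifdef|ifndef|if|elif)\b(.*)$')
_ENDIF = re.compile(r'^\s*#\s*endif\b')
# Identifiers; the leading \b avoids matching the tail of tokens such as 0x1F.
_IDENT = re.compile(r'\b[A-Za-z_]\w*')


def feature_metrics(source):
    """Return (scattering, tangling, max_depth) for C source text.

    scattering: Counter mapping each constant to the number of
                conditional directives that mention it (assumed definition).
    tangling:   list with the number of distinct constants per directive.
    max_depth:  deepest nesting level of conditional blocks.
    """
    scattering = Counter()
    tangling = []
    depth = 0
    max_depth = 0
    # Join continued lines (backslash followed by newline) first.
    source = source.replace('\\\n', '')
    for line in source.splitlines():
        match = _COND.match(line)
        if match:
            kind, expr = match.group(1), match.group(2)
            constants = set(_IDENT.findall(expr)) - {'defined'}
            for name in constants:
                scattering[name] += 1
            tangling.append(len(constants))
            # #elif continues an open block; it does not add a level.
            if kind != 'elif':
                depth += 1
                max_depth = max(max_depth, depth)
        elif _ENDIF.match(line):
            depth -= 1
    return scattering, tangling, max_depth


SAMPLE = """\
#ifdef A
int a;
#if defined(B) && defined(C)
int bc;
#endif
#endif
#ifdef A
int a2;
#endif
"""

if __name__ == "__main__":
    sd, td, nd = feature_metrics(SAMPLE)
    print(dict(sd), td, nd)
```

In the sample, constant A appears in two directives, the expression over B and C has a tangling degree of 2, and the nested block gives a depth of 2. A scattering count like this, collected per constant across a whole system, is the kind of data whose distribution the study examines.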

Notes

http://rodrigoqueiroz.bitbucket.org/sosym2015.html.
Data available at http://tuvalu.santafe.edu/~aaronc/powerlaws/data.htm.
Multilines are convenient when spanning a long line across multiple ones; during compilation, sequences of multilines are taken as a single line.
http://www.srcml.org/.
http://www.r-project.org/.
xterm change log is available at http://invisible-island.net/xterm/xterm.log.html.

Acknowledgments

We thank CNPq, CAPES, FAPEMIG, and the German Research Foundation (AP 206/4, AP 206/5, AP 206/6) for partially funding this project.

Author information

Authors and Affiliations

Federal University of Minas Gerais, Belo Horizonte, Brazil
Rodrigo Queiroz & Marco Tulio Valente
University of Waterloo, Waterloo, Canada
Leonardo Passos & Krzysztof Czarnecki
University of Passau, Passau, Germany
Claus Hunsen & Sven Apel

Corresponding author

Correspondence to Rodrigo Queiroz.

Additional information

Communicated by Prof. Andrzej Wąsowski and Thorsten Weyer.

About this article

Cite this article

Queiroz, R., Passos, L., Valente, M.T. et al. The shape of feature code: an analysis of twenty C-preprocessor-based systems. Softw Syst Model 16, 77–96 (2017). https://doi.org/10.1007/s10270-015-0483-z

Received: 24 October 2014
Revised: 07 June 2015
Accepted: 11 June 2015
Published: 16 July 2015
Issue Date: February 2017
DOI: https://doi.org/10.1007/s10270-015-0483-z
