
Tradeoffs in modeling performance of highly configurable software systems

  • Regular Paper
  • Published in: Software & Systems Modeling

Abstract

Modeling the performance of a highly configurable software system requires capturing the influences of its configuration options and their interactions on the system’s performance. Performance-influence models quantify these influences, thereby explaining the performance behavior of a configurable system as a whole. To be useful in practice, a performance-influence model should have a low prediction error, a small model size, and a reasonable computation time. Because of the inherent tradeoffs among these properties, optimizing for one property may negatively affect the others. It is unclear, though, to what extent these tradeoffs manifest themselves in practice, that is, whether a large configuration space can be described accurately only with large models and a significant resource investment. Using 10 real-world highly configurable systems from different domains, we have systematically studied the tradeoffs between the three properties. Surprisingly, we found that the tradeoffs between prediction error and model size and between prediction error and computation time are rather marginal. That is, we can learn accurate and small models in reasonable time, so that one performance-influence model can fit different use cases, such as program comprehension and performance prediction. We further investigated the reasons why the tradeoffs are marginal. We found that interactions among four or more configuration options have only a minor influence on the prediction error, and that ignoring them when learning a performance-influence model can save a substantial amount of computation time while keeping the model small, without considerably increasing the prediction error. This is an important insight for new sampling and learning techniques, as they can focus on specific regions of the configuration space and find a sweet spot between accuracy and effort.
We further analyzed why the configuration options and their interactions have the observed influences on the systems’ performance. We identified several patterns across the subject systems, such as dominant configuration options and data pipelines, that explain the influences of highly influential configuration options and interactions and give further insights into the domain of highly configurable systems.
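To make the notion of a performance-influence model concrete, the following is a minimal, hypothetical sketch (not the authors' tool): a linear model over binary configuration options and their interactions up to a chosen maximum order, fitted by least squares. All option names and influence values are invented for illustration; the point is that capping the interaction order shrinks the model (fewer terms) while an expressive-enough cap still yields a low prediction error.

```python
import numpy as np
from itertools import combinations, product

OPTIONS = ["cache", "compress", "encrypt"]  # hypothetical binary options

def true_performance(cfg):
    # Invented ground truth: a base time, per-option influences,
    # and one pairwise interaction (compress * encrypt).
    return (10.0 - 1.0 * cfg["cache"] + 4.0 * cfg["compress"]
            + 3.0 * cfg["encrypt"]
            + 2.5 * cfg["compress"] * cfg["encrypt"])

def terms_up_to(order):
    # All combinations of options up to the given interaction order.
    return [t for k in range(1, order + 1)
            for t in combinations(OPTIONS, k)]

def design_matrix(configs, terms):
    # Column 0 models the base performance; each further column is 1
    # iff all options of that term are enabled in the configuration.
    return np.array([[1.0] + [float(all(c[o] for o in t)) for t in terms]
                     for c in configs])

# Measure the whole (tiny) configuration space.
configs = [dict(zip(OPTIONS, bits)) for bits in product([0, 1], repeat=3)]
y = np.array([true_performance(c) for c in configs])

for order in (1, 2):
    terms = terms_up_to(order)
    X = design_matrix(configs, terms)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    err = np.abs(X @ coef - y).mean()
    # Order 1 misses the compress*encrypt interaction; order 2 fits exactly.
    print(f"max order {order}: {len(terms)} terms, mean abs error {err:.2f}")
```

The order-1 model is smaller but cannot express the interaction, so its error is nonzero; the order-2 model fits exactly. The paper's finding that interactions of four or more options matter little corresponds, in this sketch, to choosing a low interaction-order cap without paying much in prediction error.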



Notes

  1. http://kernel.org/.

  2. http://fosd.net/tradeoffs/.

  3. The models used in the discussions can be found on the supplementary Web site.

  4. Only valid combinations (i.e., those that respect dependencies among configuration options) are considered.

  5. The tradeoffs for other property pairs are calculated in the same way.

  6. Not all combinations of configuration options are valid system configurations, because of dependencies among the configuration options.

  7. https://httpd.apache.org/docs/2.4/mod/core.html#keepalive.

  8. pre* denotes all configuration options starting with “pre.”

  9. We excluded HSMGP, because conducting this additional experiment with the system would have taken several months of computation time.


Acknowledgements

Kolesnikov’s, Grebhahn’s, and Apel’s work has been supported by the German Research Foundation (AP 206/5, AP 206/6, AP 206/7, AP 206/11) and by the Austrian Federal Ministry of Transport, Innovation and Technology (BMVIT) Project No. 849928. Siegmund’s work has been supported by the German Research Foundation under the Contracts SI 2171/2 and SI 2171/3. Kästner’s work has been supported in part by the National Science Foundation (Awards 1318808, 1552944, and 1717022), the Science of Security Lablet (H9823014C0140), and AFRL and DARPA (FA8750-16-2-0042).

Author information

Correspondence to Sergiy Kolesnikov or Sven Apel.

Additional information

Communicated by Prof. Gordon Blair.

Appendix A: Influence of configuration options and their interactions

See Table 2.

Table 2 A list of the most influential configuration options and interactions grouped by subject system

About this article

Cite this article

Kolesnikov, S., Siegmund, N., Kästner, C. et al. Tradeoffs in modeling performance of highly configurable software systems. Softw Syst Model 18, 2265–2283 (2019). https://doi.org/10.1007/s10270-018-0662-9
