Abstract
Modeling the performance of a highly configurable software system requires capturing the influences of its configuration options and their interactions on the system's performance. Performance-influence models quantify these influences, thereby explaining the performance behavior of a configurable system as a whole. To be useful in practice, a performance-influence model should have a low prediction error, a small model size, and a reasonable computation time. Because of the inherent tradeoffs among these properties, optimizing for one property may negatively influence the others. It is unclear, though, to what extent these tradeoffs manifest in practice, that is, whether a large configuration space can be described accurately only with large models and a significant resource investment. Using 10 real-world highly configurable systems from different domains, we have systematically studied the tradeoffs between the three properties. Surprisingly, we found that the tradeoffs between prediction error and model size and between prediction error and computation time are rather marginal. That is, we can learn accurate and small models in reasonable time, so that one performance-influence model can fit different use cases, such as program comprehension and performance prediction. We further investigated the reasons why the tradeoffs are marginal. We found that interactions among four or more configuration options have only a minor influence on the prediction error, and that ignoring them when learning a performance-influence model can save a substantial amount of computation time while keeping the model small, without considerably increasing the prediction error. This is an important insight for new sampling and learning techniques, as they can focus on specific regions of the configuration space and find a sweet spot between accuracy and effort.
We further analyzed why the configuration options and their interactions have the observed influences on the systems' performance. We identified several patterns across the subject systems, such as dominant configuration options and data pipelines, that explain the influences of highly influential configuration options and interactions and provide further insights into the domain of highly configurable systems.
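To make the notion of a performance-influence model concrete, the following sketch evaluates such a model as a sum of influence terms over individual configuration options and their interactions, where an interaction term contributes only when all of its options are selected. This is a minimal illustration, not the authors' implementation; the option names, coefficients, and the `predict_performance` helper are hypothetical.

```python
def predict_performance(config, influences, base=10.0):
    """Evaluate a performance-influence model for one configuration.

    config:     dict mapping option name -> bool (selected or not)
    influences: dict mapping a tuple of option names (a single option
                or an interaction) -> its learned influence term
    base:       performance of the baseline configuration
    """
    total = base
    for options, coeff in influences.items():
        # An interaction term applies only if all its options are selected.
        if all(config.get(o, False) for o in options):
            total += coeff
    return total

# Hypothetical model: two single-option terms and one pairwise interaction.
influences = {
    ("compression",): 5.0,                # compression alone adds 5 s
    ("encryption",): 3.0,                 # encryption alone adds 3 s
    ("compression", "encryption"): -2.0,  # their interaction saves 2 s
}

config = {"compression": True, "encryption": True}
print(predict_performance(config, influences))  # 10 + 5 + 3 - 2 = 16.0
```

Restricting the keys of `influences` to tuples of at most three options corresponds to the paper's observation that higher-order interactions can often be ignored without considerably increasing the prediction error.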
Notes
The models used in the discussions can be found on the supplementary Web site.
Only valid combinations (i.e., those that respect dependencies among configuration options) are considered.
The tradeoffs for other property pairs are calculated in the same way.
Not all combinations of configuration options are valid system configurations, because of dependencies among the configuration options.
pre* denotes all configuration options starting with “pre.”
We excluded HSMGP, because conducting this additional experiment with the system would have taken several months of computation time.
Acknowledgements
Kolesnikov’s, Grebhahn’s, and Apel’s work has been supported by the German Research Foundation (AP 206/5, AP 206/6, AP 206/7, AP 206/11) and by the Austrian Federal Ministry of Transport, Innovation and Technology (BMVIT) Project No. 849928. Siegmund’s work has been supported by the German Research Foundation under the Contracts SI 2171/2 and SI 2171/3. Kästner’s work has been supported in part by the National Science Foundation (Awards 1318808, 1552944, and 1717022), the Science of Security Lablet (H9823014C0140), and AFRL and DARPA (FA8750-16-2-0042).
Additional information
Communicated by Prof. Gordon Blair.
Appendix A: Influence of configuration options and their interactions
See Table 2.
Cite this article
Kolesnikov, S., Siegmund, N., Kästner, C. et al. Tradeoffs in modeling performance of highly configurable software systems. Softw Syst Model 18, 2265–2283 (2019). https://doi.org/10.1007/s10270-018-0662-9