Abstract
Benchmarking as a method of assessing software performance is known to suffer from random fluctuations that distort the observed performance. In this paper, we focus on the fluctuations caused by compilation. We show that the design of a benchmarking experiment must reflect the existence of the fluctuations if the performance observed during the experiment is to be representative of reality.
We present a new statistical model of a benchmark experiment that reflects the presence of the fluctuations in compilation, execution and measurement. The model describes the observed performance and makes it possible to calculate the optimum dimensions of the experiment that yield the best precision within a given amount of time.
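As an illustration only, the trade-off behind "optimum dimensions" can be sketched with a simplified two-level random effects model. This is not the model from the paper, which also covers execution as a separate level. The sketch assumes each compiled binary contributes a random offset with variance `var_comp`, each measurement adds independent noise with variance `var_meas`, and compiling and measuring have fixed time costs. All function and parameter names are hypothetical. Under these assumptions the variance of the overall mean is `var_comp / n + var_meas / (n * m)` for `n` compilations with `m` measurements each, and minimizing it under a fixed time budget gives the classic two-stage sampling allocation.

```python
import math


def variance_of_mean(var_comp, var_meas, n_builds, m_per_build):
    """Variance of the grand mean under the assumed two-level model."""
    return var_comp / n_builds + var_meas / (n_builds * m_per_build)


def optimal_measurements_per_build(var_comp, var_meas, cost_build, cost_meas):
    """Real-valued m minimizing the variance for a fixed time budget.

    Follows from minimizing (var_comp + var_meas / m) * (cost_build + m * cost_meas)
    over m; the budget itself cancels out of the optimum.
    """
    if var_comp <= 0 or cost_meas <= 0:
        raise ValueError("var_comp and cost_meas must be positive")
    return math.sqrt((var_meas / var_comp) * (cost_build / cost_meas))


def plan_experiment(var_comp, var_meas, cost_build, cost_meas, budget):
    """Return (n_builds, m_per_build, variance) for the given time budget.

    Only the two integers nearest the real-valued optimum are compared,
    so this is a heuristic rather than an exhaustive search.
    """
    m_real = optimal_measurements_per_build(var_comp, var_meas, cost_build, cost_meas)
    candidates = {max(1, math.floor(m_real)), max(1, math.ceil(m_real))}
    best = None
    for m in sorted(candidates):
        # Number of whole compile-and-measure rounds that fit in the budget.
        n = int(budget // (cost_build + m * cost_meas))
        if n < 1:
            continue
        v = variance_of_mean(var_comp, var_meas, n, m)
        if best is None or v < best[2]:
            best = (n, m, v)
    if best is None:
        raise ValueError("budget too small for a single compilation")
    return best
```

In this sketch, the number of measurements per compilation grows when compilation is expensive or measurement noise dominates, and shrinks when the variation between compilations dominates. In practice the variances would first have to be estimated from a pilot experiment.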
Using a variety of benchmarks, we evaluate the model within the context of regression benchmarking. We show that the model significantly decreases the number of erroneously detected performance changes in regression benchmarking.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kalibera, T., Tuma, P. (2006). Precise Regression Benchmarking with Random Effects: Improving Mono Benchmark Results. In: Horváth, A., Telek, M. (eds) Formal Methods and Stochastic Models for Performance Evaluation. EPEW 2006. Lecture Notes in Computer Science, vol 4054. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11777830_5
DOI: https://doi.org/10.1007/11777830_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35362-1
Online ISBN: 978-3-540-35365-2
eBook Packages: Computer Science, Computer Science (R0)