Abstract
Multicore hardware and system software have become complex and differ from platform to platform. Parallel application performance optimization and portability are now a real challenge. In practice, the effects of tuning parameters are hard to predict. Programmers face even more difficulties when several applications run in parallel and influence each other indirectly. We tackle these problems with Perpetuum, a novel operating-system-based auto-tuner that is capable of tuning applications while they are running. We go beyond tuning one application in isolation and are the first to employ OS-based auto-tuning to improve system-wide application performance. Our fully functional auto-tuner extends the Linux kernel, and the application tuning process does not require any user involvement. General multicore applications are automatically re-tuned on new platforms while they are executing, which makes portability easy. Extensive case studies with real applications demonstrate the feasibility and efficiency of our approach. Perpetuum realizes a first milestone in our vision to make every performance-critical multicore application auto-tuned by default.
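The idea of tuning an application while it runs can be illustrated with a minimal sketch. This is not Perpetuum's algorithm or its kernel interface, which the abstract does not describe. It is a generic hill-climbing loop over one integer tuning parameter, and every name in it (`tune_online`, `synthetic_iteration_time`, the thread-count parameter, the cost model) is a hypothetical choice made for illustration.

```python
from typing import Callable


def tune_online(measure: Callable[[int], float],
                start: int, lo: int, hi: int, rounds: int) -> int:
    """Hill-climb one integer tuning parameter using measured run times.

    `measure(value)` is assumed to run one application iteration with the
    parameter set to `value` and return the elapsed time (lower is better).
    """
    best = start
    best_time = measure(best)
    step = 1  # current search direction
    for _ in range(rounds):
        # Propose a neighbouring value, clamped to the allowed range.
        candidate = min(hi, max(lo, best + step))
        if candidate == best:
            # Hit a range boundary: reverse direction and try again.
            step = -step
            continue
        elapsed = measure(candidate)
        if elapsed < best_time:
            best, best_time = candidate, elapsed
        else:
            # No improvement: probe the other direction next round.
            step = -step
    return best


def synthetic_iteration_time(threads: int) -> float:
    """Hypothetical cost model standing in for a real measurement:
    work divides across threads, but each thread adds fixed overhead."""
    return 100.0 / threads + 2.0 * threads


if __name__ == "__main__":
    chosen = tune_online(synthetic_iteration_time, start=1, lo=1, hi=16, rounds=20)
    print("chosen thread count:", chosen)
```

In a real setting the `measure` callback would time an actual iteration of the application, so measurements would be noisy and would also reflect interference from other running programs, which is the system-wide effect the abstract refers to.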
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Karcher, T., Pankratius, V. (2011). Run-Time Automatic Performance Tuning for Multicore Applications. In: Jeannot, E., Namyst, R., Roman, J. (eds) Euro-Par 2011 Parallel Processing. Euro-Par 2011. Lecture Notes in Computer Science, vol 6852. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23400-2_2
DOI: https://doi.org/10.1007/978-3-642-23400-2_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23399-9
Online ISBN: 978-3-642-23400-2
eBook Packages: Computer Science, Computer Science (R0)