Abstract
This work proposes a new performance evaluation technique that predicts the performance of parallel programs across diverse and complex systems. Here, the term system encompasses the hardware organization as well as the development and execution environments.
The proposed technique collects completion times for some (program, system) pairs and constructs an empirical model that learns to predict the performance of unseen (program, system) pairs. The approach is feature-agnostic because it requires no prior knowledge of program or system characteristics (features) to predict performance.
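To make the idea of predicting from completion times alone more concrete, here is a minimal sketch in Python. It is not the model used in the paper: it assumes a simple nearest-neighbor rule over log-ratios of completion times, and the function name `predict_time`, the program names, the system names, and all timings are hypothetical.

```python
import math

def predict_time(times, program, system):
    """Predict the completion time of an unmeasured (program, system) pair.

    `times` maps (program, system) -> measured completion time in seconds.
    Only completion times are used; no program or system features are needed.
    """
    best = None  # (spread, predicted log-time) of the most similar program
    for other in {p for (p, _) in times}:
        if other == program or (other, system) not in times:
            continue
        # Systems, other than the target, on which both programs were measured.
        shared = [s for (p, s) in times
                  if p == program and s != system and (other, s) in times]
        if not shared:
            continue
        # Log-ratio of the two programs' completion times on each shared system.
        ratios = [math.log(times[(program, s)] / times[(other, s)]) for s in shared]
        mean = sum(ratios) / len(ratios)
        # A small spread means the two programs scale alike across systems.
        spread = sum((r - mean) ** 2 for r in ratios) / len(ratios)
        candidate = (spread, math.log(times[(other, system)]) + mean)
        if best is None or candidate < best:
            best = candidate
    if best is None:
        raise ValueError("no comparable program was measured on this system")
    return math.exp(best[1])

# Hypothetical measurements: program B has not been run on system s3.
times = {
    ("A", "s1"): 10.0, ("A", "s2"): 20.0, ("A", "s3"): 40.0,
    ("B", "s1"): 5.0,  ("B", "s2"): 10.0,
    ("C", "s1"): 8.0,  ("C", "s2"): 8.0,  ("C", "s3"): 30.0,
}
prediction = predict_time(times, "B", "s3")
```

In this toy data set, B runs in half the time of A on both shared systems, so A is selected as the most similar program and the prediction for (B, s3) is close to 20 seconds, half of A's time on s3. A practical model would likely combine several similar programs and validate its predictions on held-out pairs.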
Experiments with a large number of systems and of serial and parallel benchmark suites, including SPEC CPU2006 and SPEC OMP2012, show that the proposed technique is equally applicable to several compelling performance evaluation studies, including characterization, comparison, and tuning of hardware configurations, compilers, run-time environments, or any combination thereof.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cammarota, R., Beni, L.A., Nicolau, A., Veidenbaum, A.V. (2013). Optimizing Program Performance via Similarity, Using a Feature-Agnostic Approach. In: Wu, C., Cohen, A. (eds) Advanced Parallel Processing Technologies. APPT 2013. Lecture Notes in Computer Science, vol 8299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45293-2_15
DOI: https://doi.org/10.1007/978-3-642-45293-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45292-5
Online ISBN: 978-3-642-45293-2
eBook Packages: Computer Science (R0)