Abstract
Diminishing performance returns and increasing power consumption of single-threaded processors have made chip multiprocessors (CMPs) an industry imperative. Unfortunately, poor software/hardware interaction and bottlenecks in shared hardware structures can prevent scaling to many cores. In fact, adding a core may harm performance and increase power consumption. Given these observations, we compare two approaches to predicting parallel application scalability: multiple linear regression and artificial neural networks (ANNs). We throttle concurrency to levels with higher predicted power/performance efficiency. We perform experiments on a state-of-the-art, dual-processor, quad-core platform, showing that both methodologies achieve high accuracy and identify energy-efficient concurrency levels in multithreaded scientific applications. The ANN approach has advantages, but the simpler regression-based model achieves slightly higher accuracy and performance. The approaches exhibit median error of 7.5% and 5.6%, and improve performance by an average of 7.4% and 9.5%, respectively.
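The regression-based strategy can be illustrated with a minimal sketch: fit a linear model to performance observed at a few concurrency levels, then throttle to the level with the best predicted performance per watt. Everything below is invented for illustration — the thread-count features, the synthetic throughput numbers, and the power model are placeholders, not the paper's actual predictors or measurements.

```python
import numpy as np

# Hypothetical measurements: throughput observed at a few
# concurrency levels (threads), showing saturating scalability.
threads = np.array([1, 2, 4, 8], dtype=float)
throughput = np.array([1.0, 1.9, 3.4, 4.1])

# Multiple linear regression on simple features of the thread count
# (placeholders for whatever inputs a real model would use).
X = np.column_stack([np.ones_like(threads), threads, np.log2(threads)])
coef, *_ = np.linalg.lstsq(X, throughput, rcond=None)

def predict(n):
    """Predicted throughput at concurrency level n."""
    return coef[0] + coef[1] * n + coef[2] * np.log2(n)

def power(n):
    """Toy power model: baseline plus per-core cost (illustrative)."""
    return 80.0 + 15.0 * n

# Throttle concurrency to the level with the highest predicted
# performance per watt.
candidates = [1, 2, 4, 8]
best = max(candidates, key=lambda n: predict(n) / power(n))
print("chosen concurrency level:", best)
```

With these synthetic numbers the model picks an intermediate concurrency level, mirroring the paper's observation that running on all cores is not always the most energy-efficient choice. An ANN-based predictor would slot into the same loop by replacing `predict` with a trained network.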
© 2010 Springer-Verlag Berlin Heidelberg
Singh, K. et al. (2010). Comparing Scalability Prediction Strategies on an SMP of CMPs. In: D’Ambra, P., Guarracino, M., Talia, D. (eds) Euro-Par 2010 - Parallel Processing. Euro-Par 2010. Lecture Notes in Computer Science, vol 6271. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15277-1_14
Print ISBN: 978-3-642-15276-4
Online ISBN: 978-3-642-15277-1