Abstract
The complexity of state-of-the-art processor architectures, and the vast design spaces that result, make finding the best configuration difficult and time-consuming. Design space exploration (DSE) is the systematic analysis of design points and the pruning of unwanted ones based on parameters of interest. DSE requires estimating the performance criteria of design points, and a more accurate estimate yields a more efficient target design. A common estimation approach is empirical modeling: machine learning based on statistical inference, which requires only a limited number of simulations. An empirical model finds optima much faster than cycle-accurate simulation and is much more accurate than analytical models. This paper therefore proposes a general methodology and framework for finding the most accurate empirical model for estimating the performance of general-purpose or embedded multiprocessors running multithreaded workloads. The framework consists of three main steps: (1) workload characterization and clustering, (2) finding the optimal model, and (3) estimating the performance of a new workload outside the training set. The resulting performance prediction models can be used to explore the architectural design space. To demonstrate feasibility, the framework is applied to an experimental case, for which validation experiments show mean absolute errors (MAEs) below 10%.
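The three steps can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data in `TRAIN`, the two-element characteristic vectors, the use of k-means for clustering, the two candidate models (`fit_mean`, `fit_linear`), and leave-one-out percentage error as the selection criterion. The abstract does not state which features, clustering method, or model families the framework actually uses.

```python
from statistics import mean

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))

def kmeans(points, k, iters=20):
    """Tiny k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [[mean(col) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def fit_mean(xs, ys):
    """Candidate model 1: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Candidate model 2: ordinary least squares for y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def loo_mae_percent(fit, xs, ys):
    """Leave-one-out mean absolute error, as a percentage of the true value."""
    errs = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errs.append(abs(model(xs[i]) - ys[i]) / abs(ys[i]) * 100)
    return mean(errs)

def best_model(xs, ys, candidates):
    """Step 2: pick the candidate with the lowest leave-one-out error."""
    name = min(candidates, key=lambda n: loo_mae_percent(candidates[n], xs, ys))
    return name, candidates[name](xs, ys)

# Hypothetical training data: (workload characteristics, [(config, performance)]).
TRAIN = [
    ([0.1, 0.2], [(1, 1.0), (2, 1.5), (3, 2.0)]),
    ([0.9, 0.8], [(1, 0.5), (2, 0.5), (3, 0.5)]),
    ([0.2, 0.1], [(1, 1.1), (2, 1.6), (4, 2.6)]),
    ([0.8, 0.9], [(1, 0.5), (2, 0.5), (4, 0.5)]),
]
CANDIDATES = {"mean": fit_mean, "linear": fit_linear}

# Step 1: cluster workloads by their characteristic vectors.
centroids = kmeans([chars for chars, _ in TRAIN], k=2)

# Step 2: pool the simulation samples of each cluster and select its model.
models = []
for c in range(len(centroids)):
    samples = [s for chars, runs in TRAIN
               if nearest(chars, centroids) == c for s in runs]
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    models.append(best_model(xs, ys, CANDIDATES))

def predict(chars, config):
    """Step 3: route an unseen workload to its nearest cluster's model."""
    _, model = models[nearest(chars, centroids)]
    return model(config)

print(predict([0.15, 0.25], 3))  # unseen workload, nearest to the first cluster
```

In this sketch, model selection happens once per cluster rather than once per workload, so a new workload needs only its characteristic vector, not fresh simulations, to obtain a prediction.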















Cite this article
Sheidaeian, H., Fatemi, O. Toward a general framework for jointly processor-workload empirical modeling. J Supercomput 77, 5319–5353 (2021). https://doi.org/10.1007/s11227-020-03475-9