Toward a general framework for jointly processor-workload empirical modeling

The Journal of Supercomputing

Abstract

The complexity of state-of-the-art processor architectures, and the consequently vast design spaces they span, makes finding the best configuration difficult and time-consuming. Design space exploration (DSE) refers to the systematic analysis and pruning of unwanted design points based on parameters of interest, and it requires estimating the performance of candidate design points; the more accurate the estimate, the more efficient the resulting target design. One common estimation approach is machine learning based on statistical inference, also known as empirical modeling, which requires only a limited number of simulations. An empirical model locates the optima much faster than cycle-accurate simulation and is much more accurate than analytical models. This paper therefore proposes a general methodology and framework for finding the most accurate empirical model to estimate the performance of general-purpose or embedded multiprocessors running multithreaded workloads. The framework consists of three main steps: (1) workload characterization and clustering, (2) finding the optimal model, and (3) estimating the performance of a new workload outside the training set. The resulting performance prediction models can then be used during architectural design space exploration. An experimental case study demonstrates the feasibility of the framework; validation experiments show mean absolute errors (MAEs) below 10% for this case.
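
The article text is behind a paywall and provides no code here, but the three-step flow summarized in the abstract can be made concrete with a short sketch. The example below is a minimal illustration using scikit-learn; the feature layout, the candidate regressors (linear, random forest, MLP), the clustering setup, and the randomly generated data are assumptions for illustration only, not the authors' framework or experimental setup.

```python
# Minimal sketch of the three steps named in the abstract, assuming scikit-learn
# as the modeling toolkit. All data below are random placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# --- Step 1: workload characterization and clustering ------------------------
# Rows = workloads, columns = microarchitecture-independent characteristics
# (instruction mix, ILP, memory behavior, ...); random placeholders here.
workload_features = rng.random((40, 8))
scaler = StandardScaler().fit(workload_features)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(
    scaler.transform(workload_features))
# kmeans.labels_ groups similar workloads; representatives of each cluster
# could be chosen for the (expensive) simulations that produce training data.

# --- Step 2: find the most accurate empirical model ---------------------------
# X = architecture parameters + workload characteristics per simulated design
# point; y = measured performance (e.g. CPI) from those simulations.
X = np.hstack([rng.random((40, 4)), workload_features])
y = rng.random(40)
candidates = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "mlp": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
}
scores = {name: cross_val_score(model, X, y, cv=5,
                                scoring="neg_mean_absolute_error").mean()
          for name, model in candidates.items()}
best_name = max(scores, key=scores.get)      # highest neg-MAE = lowest MAE
best_model = candidates[best_name].fit(X, y)

# --- Step 3: estimate performance of a new, unseen workload -------------------
new_workload = rng.random((1, 8))
new_cluster = kmeans.predict(scaler.transform(new_workload))[0]  # nearest group
new_design = np.hstack([rng.random((1, 4)), new_workload])
predicted_performance = best_model.predict(new_design)
print(best_name, new_cluster, predicted_performance)
```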

Author information

Corresponding author

Correspondence to Hamed Sheidaeian.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sheidaeian, H., Fatemi, O. Toward a general framework for jointly processor-workload empirical modeling. J Supercomput 77, 5319–5353 (2021). https://doi.org/10.1007/s11227-020-03475-9

Keywords