CPU Architecture Modelling and Co-design

  • Conference paper
High Performance Computing (ISC High Performance 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13948))

Abstract

Co-design has become an established process for developing both high-performance computing (HPC) architectures (more specifically, CPU architectures) and HPC applications. The co-design process is frequently based on models. This paper discusses an approach to CPU architecture modelling and its relation to modelling theory. The approach is implemented using the gem5 simulator for Arm-based CPU architectures and is applied to generate co-design knowledge using two applications that are widely used on HPC systems.

Notes

  1. The simulator source used, including its configuration, is available at https://github.com/binebrank/gem5/tree/neoverse_model.

  2. https://github.com/ssvb/tinymembench.

  3. https://github.com/benchmark-subsetting/NPB3.0-omp-C.

  4. In the carbon nanotube use case, a 19-point stencil is used in combination with a 7-point stencil, which we have not evaluated.

  5. The source code used, together with manually vectorized functions, is available at https://gitlab.jsc.fz-juelich.de/brank1/gpaw-benchmarks.

  6. The rename component in gem5 stalls if there are no physical registers available or the ROB is full.

Acknowledgements

The authors would like to thank the Stony Brook Research Computing and Cyberinfrastructure, and the [...]. Furthermore, we want to thank the Open Edge and HPC Initiative for access to an Arm-based development [...]. Funding for parts of this work has been received from the European Commission H2020 program under Grant Agreement 779877 (Mont-Blanc 2020), and from the Swedish e-Science Research Centre (SeRC).

Author information

Corresponding author

Correspondence to Dirk Pleiter.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Brank, B., Pleiter, D. (2023). CPU Architecture Modelling and Co-design. In: Bhatele, A., Hammond, J., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023. Lecture Notes in Computer Science, vol 13948. Springer, Cham. https://doi.org/10.1007/978-3-031-32041-5_1

  • DOI: https://doi.org/10.1007/978-3-031-32041-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-32040-8

  • Online ISBN: 978-3-031-32041-5

  • eBook Packages: Computer Science; Computer Science (R0)
