Abstract
Co-design has become an established process for developing both high-performance computing (HPC) architectures (more specifically, CPU architectures) and HPC applications. The co-design process is frequently based on models. This paper discusses an approach to CPU architecture modelling and its relation to modelling theory. The approach is implemented using the gem5 simulator for Arm-based CPU architectures and is applied to generate co-design knowledge using two applications that are widely used on HPC systems.
Notes
- 1.
The simulator source code, including its configuration, is available at https://github.com/binebrank/gem5/tree/neoverse_model.
- 2.
- 3.
- 4.
In the carbon nanotube use case, a 19-point stencil is used in combination with a 7-point stencil; we have not evaluated this combination.
- 5.
The source code used, together with the manually vectorized functions, is available at https://gitlab.jsc.fz-juelich.de/brank1/gpaw-benchmarks.
- 6.
The rename component in gem5 stalls if there are no physical registers available or the ROB is full.
Acknowledgements
The authors would like to thank the Stony Brook Research Computing and Cyberinfrastructure, and the [...]. Furthermore, we want to thank the Open Edge and HPC Initiative for access to an Arm-based development [...]. Funding for parts of this work has been received from the European Commission H2020 program under Grant Agreement 779877 (Mont-Blanc 2020), and from the Swedish e-Science Research Centre (SeRC).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Brank, B., Pleiter, D. (2023). CPU Architecture Modelling and Co-design. In: Bhatele, A., Hammond, J., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023. Lecture Notes in Computer Science, vol 13948. Springer, Cham. https://doi.org/10.1007/978-3-031-32041-5_1
DOI: https://doi.org/10.1007/978-3-031-32041-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-32040-8
Online ISBN: 978-3-031-32041-5
eBook Packages: Computer Science, Computer Science (R0)