ABSTRACT
As computer systems grow larger and more complex, simulating their behavior in detail takes more time. Researchers who simulate large-scale systems must choose between using less accurate high-level models and simulating smaller portions of their benchmark suite. Both are highly manual, offline approaches that require time-consuming analysis by experts. Multifidelity simulation aims to lessen this burden by automatically adapting the fidelity of a simulation to the complexity of the behavior occurring at any given point in time. We show how a multifidelity memory system model can be used to accelerate single-node simulation by up to 2x with 1-5% mean absolute percent error in the simulated instructions per cycle across benchmark suites.
Index Terms
- Multifidelity Memory System Simulation in SST