Abstract
Modern microprocessor design relies heavily on detailed full-chip performance simulations to evaluate complex trade-offs. Typically, different design alternatives are tried out for a specific sub-system or component, while the rest of the system is kept unchanged. We observe that full-chip simulations are overkill for such studies. This paper introduces mesoscale simulation, which employs high-level modeling for the unchanged parts of a design and uses detailed cycle-accurate simulations for the components being modified. This combination of high-level and low-level modeling enables accuracy on par with detailed full-chip modeling while achieving much higher simulation speeds. Consequently, mesoscale models can be used to quickly explore vast areas of the design space with high fidelity. We describe a proof-of-concept mesoscale implementation of the memory subsystem of the Cell/B.E. processor and discuss results from running various workloads.
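The split between high-level and detailed modeling described above can be illustrated with a toy sketch. This is not the paper's implementation, and it does not model the Cell/B.E.: the class names (AbstractCore, DetailedMemoryController), the exponential request gaps, the single-server queue, and all parameter values are assumptions chosen only for illustration. The unchanged part of the design (the cores) is reduced to a statistical request generator, while the component under study (a memory controller) is stepped cycle by cycle.

```python
import random
from collections import deque


class AbstractCore:
    """High-level model (hypothetical): a core reduced to a request generator."""

    def __init__(self, core_id, mean_gap, seed):
        self.core_id = core_id
        self.mean_gap = mean_gap
        self.rng = random.Random(seed)  # private RNG keeps runs reproducible
        self.next_issue = self._gap()

    def _gap(self):
        # Assumed traffic model: exponential inter-request gap, at least 1 cycle.
        return max(1, round(self.rng.expovariate(1.0 / self.mean_gap)))

    def tick(self, cycle):
        """Return (core_id, issue_cycle) if a request is due, else None."""
        if cycle >= self.next_issue:
            self.next_issue = cycle + self._gap()
            return (self.core_id, cycle)
        return None


class DetailedMemoryController:
    """Cycle-stepped model (hypothetical) of the component being modified."""

    def __init__(self, service_cycles, queue_depth):
        self.service_cycles = service_cycles
        self.queue_depth = queue_depth
        self.queue = deque()
        self.in_service = None
        self.busy_until = 0
        self.latencies = []
        self.rejected = 0  # simplification: a full queue drops the request

    def accept(self, request):
        if len(self.queue) < self.queue_depth:
            self.queue.append(request)
        else:
            self.rejected += 1

    def tick(self, cycle):
        # Retire the request in service once its service time has elapsed.
        if self.in_service is not None and cycle >= self.busy_until:
            _, issued = self.in_service
            self.latencies.append(cycle - issued)
            self.in_service = None
        # Start the next queued request if the server is idle.
        if self.in_service is None and self.queue:
            self.in_service = self.queue.popleft()
            self.busy_until = cycle + self.service_cycles


def simulate(num_cores, mean_gap, service_cycles, queue_depth, cycles, seed=0):
    cores = [AbstractCore(i, mean_gap, seed + i) for i in range(num_cores)]
    mem = DetailedMemoryController(service_cycles, queue_depth)
    for cycle in range(cycles):
        for core in cores:
            request = core.tick(cycle)
            if request is not None:
                mem.accept(request)
        mem.tick(cycle)
    done = len(mem.latencies)
    avg = sum(mem.latencies) / done if done else 0.0
    return {"completed": done, "rejected": mem.rejected, "avg_latency": avg}


if __name__ == "__main__":
    # Two design alternatives for the detailed component, same abstract cores.
    for service_cycles in (2, 20):
        print(service_cycles, simulate(4, 20, service_cycles, 4, 10_000, seed=1))
```

Because the abstract cores do not depend on the controller's state, both design alternatives see the same request stream, so differences in the reported latency come only from the detailed component. A real mesoscale model would also need feedback from the detailed part to the high-level part (for example, cores stalling on outstanding requests), which this sketch omits.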
Additional information
Communicated by Dr. Raffaela Mirandola and David Lilja.
T. Kiss participated in this work while working at IBM Germany Research & Development, Boeblingen, Germany. R. Rangan participated in this work while working at IBM Research in Austin, TX.
About this article
Cite this article
Altevogt, P., Kiss, T., Kistler, M. et al. Mesoscale performance simulation of multicore processor systems. Softw Syst Model 12, 731–744 (2013). https://doi.org/10.1007/s10270-012-0231-6
DOI: https://doi.org/10.1007/s10270-012-0231-6