DOI: 10.1145/2656075.2656088
research-article

HSAemu: a full system emulator for HSA platforms

Published: 12 October 2014

ABSTRACT

Heterogeneous System Architecture (HSA) is an open industry standard designed to support a wide variety of data-parallel and task-parallel programming models. Most HSA hardware and software components are still in development, so heterogeneous simulation environments are helpful to developers building HSA software stacks. This paper presents the design of HSAemu, a full system emulator for the HSA platform, and illustrates how HSA features are implemented in the simulator. HSAemu provides an infrastructure for heterogeneous simulation environments by supporting the required HSA features, including hUMA, hQ, and HSAIL. Built on this infrastructure, HSAemu provides two simulation models, FastSim and DeepSim, for high-speed functional emulation and slower cycle-accurate simulation, respectively. In our preliminary experiments, HSAemu helps test a complete HSA software stack and profile system performance. Our case studies show that HSAemu is very useful as a hardware/software co-design tool for heterogeneous systems.

        Published in

        CODES '14: Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis
        October 2014
        331 pages
        ISBN: 9781450330510
        DOI: 10.1145/2656075

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 280 of 864 submissions, 32%
