DOI: 10.1145/1878961.1879011
research-article

A performance model and code overlay generator for scratchpad enhanced embedded processors

Published: 24 October 2010

ABSTRACT

Software-managed scratchpad memories (SPMs) provide improved performance and power in embedded processors by reducing required hardware resources. Performance depends strongly on the scheme used to map code and data onto the SPM, but generating optimal mappings can be extremely difficult. Here we address instruction mapping on SPMs and present a performance model and an algorithm, the "Code Overlay Generator" (COG), for producing high-performance dynamic SPM code mappings. Our heuristic does not require profiling information and is suitable for generating mapping solutions for large programs, for which previously proposed Integer Linear Programming (ILP) techniques are infeasible.

We compare our algorithm with a published heuristic and the code overlay mapping algorithm provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. We find an average performance advantage of 34% compared to the previous algorithm, and 87% with respect to spu-gcc. We additionally show that our performance model enables improved tools for offline evaluation of code overlay performance and mapping selection.

      Published in

      CODES/ISSS '10: Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
      October 2010
      348 pages
      ISBN: 9781605589053
      DOI: 10.1145/1878961

      Copyright © 2010 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 280 of 864 submissions, 32%
