DOI: 10.1145/1529282.1529649

Celling SHIM: compiling deterministic concurrency to a heterogeneous multicore

Published: 8 March 2009

ABSTRACT

Parallel architectures are the way of the future, but are notoriously difficult to program. In addition to the low-level constructs they often present (e.g., locks, DMA, and non-sequential memory models), most parallel programming environments admit data races: the environment may make nondeterministic scheduling choices that can change the function of the program.

We believe the solution is model-based design, where the programmer is presented with a constrained higher-level language that prevents certain unwanted behavior. In this paper, we describe a compiler for the SHIM scheduling-independent concurrent language that generates code for the Cell Broadband Engine heterogeneous multicore processor. The complexity of the code our compiler generates relative to the source illustrates how difficult it is to manually write code for the Cell.

We demonstrate the efficacy of our compiler on two examples. While the SHIM language is (by design) not ideal for every algorithm, it works well for certain applications and simplifies the parallel programming process, especially on the Cell architecture.
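
To make "scheduling-independent" concrete, the following is a minimal sketch in Python, not SHIM code and not the output of the compiler described here. It assumes a simplified model in which each channel has exactly one writer and one reader and the only way to observe a channel is a blocking read. The names `producer`, `doubler`, and `run_pipeline` are hypothetical, and `queue.Queue(maxsize=1)` is a small bounded buffer standing in for a channel rather than a true rendezvous. Random sleeps perturb the thread schedule, yet the result stays the same.

```python
import queue
import random
import threading
import time


def producer(out_ch, n):
    """Write 0..n-1 to out_ch, then an end-of-stream marker (None)."""
    for i in range(n):
        time.sleep(random.uniform(0, 0.002))  # perturb the schedule
        out_ch.put(i)                         # blocks while the channel is full
    out_ch.put(None)


def doubler(in_ch, out_ch):
    """Read values from in_ch and write twice each value to out_ch."""
    while True:
        v = in_ch.get()                       # blocking read, no polling
        if v is None:
            out_ch.put(None)                  # forward end-of-stream
            return
        time.sleep(random.uniform(0, 0.002))  # perturb the schedule
        out_ch.put(2 * v)


def run_pipeline(n):
    """Run producer -> doubler -> collector and return the collected list."""
    a = queue.Queue(maxsize=1)                # single writer, single reader
    b = queue.Queue(maxsize=1)                # single writer, single reader
    threads = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=doubler, args=(a, b)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        v = b.get()
        if v is None:
            break
        results.append(v)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(5))
```

Because no thread can test whether a channel is empty or share mutable state outside the channels, the interleaving chosen by the scheduler affects timing only, not the values produced. A version in which both threads updated a shared variable without synchronization would lose that guarantee.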


    Published in

      SAC '09: Proceedings of the 2009 ACM Symposium on Applied Computing
      March 2009
      2347 pages
      ISBN: 9781605581668
      DOI: 10.1145/1529282

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article

      Acceptance Rates

      Overall acceptance rate: 1,650 of 6,669 submissions (25%)
