DOI: 10.1145/1878921.1878924
research-article

Erbium: a deterministic, concurrent intermediate representation to map data-flow tasks to scalable, persistent streaming processes

Published: 24 October 2010

ABSTRACT

Tuning applications for multicore systems involves subtle concurrency concepts and target-dependent optimizations. This paper advocates for a streaming execution model, called ER, where persistent processes communicate and synchronize through a multi-consumer [...] processing applications, we demonstrate the scalability and efficiency advantages of streaming compared to data-driven scheduling. To exploit these benefits in compilers for parallel languages, we propose an intermediate representation enabling the compilation of data-flow tasks into streaming processes. This intermediate representation also facilitates the application of classical compiler optimizations to concurrent programs.
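
To make the idea of persistent streaming processes concrete, here is a minimal Python sketch. It is not Erbium's interface or the paper's synchronization construct: the names `producer`, `stage`, `run_pipeline`, and `SENTINEL` are hypothetical, and the sketch assumes plain single-producer, single-consumer bounded queues from the standard library. Each process is created once and lives for the whole stream, blocking on its input buffer rather than being scheduled as a separate task per data item.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker, not part of any real API


def producer(out_q, items):
    """Persistent process: pushes every item, then signals end of stream."""
    for x in items:
        out_q.put(x)  # blocks while the bounded buffer is full
    out_q.put(SENTINEL)


def stage(in_q, out_q, fn):
    """Persistent process: handles the whole stream in one long-lived loop."""
    while True:
        x = in_q.get()  # blocks until an item is available
        if x is SENTINEL:
            out_q.put(SENTINEL)  # forward end of stream downstream
            return
        out_q.put(fn(x))


def run_pipeline(items, fn, capacity=4):
    """Run producer -> stage -> collector and return the results in order."""
    q1 = queue.Queue(maxsize=capacity)
    q2 = queue.Queue(maxsize=capacity)
    threads = [
        threading.Thread(target=producer, args=(q1, items)),
        threading.Thread(target=stage, args=(q1, q2, fn)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        y = q2.get()
        if y is SENTINEL:
            break
        results.append(y)
    for t in threads:
        t.join()
    return results
```

Because each queue has exactly one writer and one reader and preserves FIFO order, the output sequence does not depend on how the threads are interleaved. For example, `run_pipeline(range(5), lambda x: x * x)` returns the squares in input order.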

Published in

CASES '10: Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
October 2010
276 pages
ISBN: 9781605589039
DOI: 10.1145/1878921

Copyright © 2010 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 52 of 230 submissions, 23%
