Abstract
Because multicore CPUs have become the standard for all major hardware manufacturers, it is increasingly important for programming languages to provide abstractions that map effectively onto parallel architectures. Stream processing is a programming paradigm in which computations are expressed as independent actors that communicate via FIFO data channels. The coarse-grained parallelism exposed in stream programs facilitates an efficient mapping of actors onto the underlying multicore hardware.
We propose StreamPI, a stream-parallel programming abstraction that extends object-oriented languages with stream-programming facilities. StreamPI consists of a class hierarchy for actor specification together with a language-independent runtime system that supports the execution of stream programs on multicore architectures. We show that the language-specific part of StreamPI, i.e., the class hierarchy, can be implemented as a library-level programming language extension. A library-level extension has the advantage that the existing programming language implementation need not be modified. Legacy code can be mixed with a stream-parallel application, and the use of sequential legacy code with actors is supported. Unlike previous approaches, StreamPI allows stream programs to be created dynamically and then executed. StreamPI actors are typed. Type safety is achieved through type checks at stream-graph creation time.
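As an illustration of what a library-level actor hierarchy with type checks at stream-graph creation time could look like, the following C++17 sketch may help. It is not StreamPI's actual API: the names `Actor`, `Filter`, `Pipeline`, `demo`, and `rejects_mismatch` are hypothetical, the graph is restricted to a linear pipeline, and execution is sequential rather than scheduled on multiple cores.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <memory>
#include <stdexcept>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical actor base class (not StreamPI's): exposes the item types
// of its input and output so a graph builder can check them.
class Actor {
public:
    virtual ~Actor() = default;
    virtual std::type_index input_type() const = 0;
    virtual std::type_index output_type() const = 0;
    // One firing: consume one item, produce one item.
    virtual std::any fire(const std::any& item) = 0;
};

// Typed filter that wraps a work function In -> Out.
template <typename In, typename Out>
class Filter : public Actor {
public:
    explicit Filter(std::function<Out(In)> work) : work_(std::move(work)) {}
    std::type_index input_type() const override { return typeid(In); }
    std::type_index output_type() const override { return typeid(Out); }
    std::any fire(const std::any& item) override {
        return work_(std::any_cast<In>(item));
    }
private:
    std::function<Out(In)> work_;
};

// Pipeline assembled at run time. add() performs the type check when the
// graph is created, before any data flows.
class Pipeline {
public:
    void add(std::shared_ptr<Actor> a) {
        if (!stages_.empty() && stages_.back()->output_type() != a->input_type())
            throw std::invalid_argument("stream type mismatch between actors");
        stages_.push_back(std::move(a));
    }
    // Sequential reference execution; a parallel runtime would instead map
    // the actors onto cores and connect them with FIFO channels.
    std::any run(std::any item) const {
        assert(!stages_.empty() && "pipeline has no actors");
        for (const auto& s : stages_) item = s->fire(item);
        return item;
    }
private:
    std::vector<std::shared_ptr<Actor>> stages_;
};

// Builds int -> int -> double dynamically and pushes one item through.
double demo() {
    Pipeline p;
    p.add(std::make_shared<Filter<int, int>>([](int x) { return x + 1; }));
    p.add(std::make_shared<Filter<int, double>>([](int x) { return x / 2.0; }));
    return std::any_cast<double>(p.run(3));
}

// Returns true if connecting a double output to an int input is rejected.
bool rejects_mismatch() {
    Pipeline p;
    p.add(std::make_shared<Filter<int, double>>([](int x) { return x * 0.5; }));
    try {
        p.add(std::make_shared<Filter<int, int>>([](int x) { return x; }));
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```

In this sketch the mismatch is detected inside `Pipeline::add`, that is, while the graph is being built, so a wrongly connected actor never receives data. Everything is ordinary library code, so no compiler changes are needed, which is the property a library-level extension relies on.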
We have implemented StreamPI’s language-independent runtime system and language interfaces for Ada 2005 and C++ for Intel multicore architectures. We have evaluated StreamPI on up to 16 cores of a two-CPU, 8-core Intel Xeon X7560 server, and we provide a performance comparison with StreamIt (Gordon et al. in International Conference on Architectural Support for Programming Languages and Operating Systems, 2006), which is the de facto standard for stream-parallel programming. Although our approach provides greater programming flexibility than StreamIt, the performance of StreamPI compares favorably to the static compilation model of StreamIt.
References
Amarasinghe S, Gordon MI, Karczmarek M, Lin J, Maze D, Rabbah RM, Thies W (2005) Language and compiler design for streaming applications. Int J Parallel Program 33(2):261–278
Andrews J, Baker N (2006) Xbox 360 system architecture. IEEE MICRO 26(2):25–37
Bhattacharyya SS, Lee EA, Murthy PK (1996) Software synthesis from dataflow graphs. Kluwer Academic, Norwell
Belina F, Hogrefe D (1989) The CCITT-specification and description language SDL. Comput Netw 16:311–341
Berry G, Gonthier G (1992) The Esterel synchronous programming language: design, semantics, implementation. Sci Comput Program 19(2):87–152
Bryant RE, O’Halloran DR (2003) Computer systems: a programmer’s perspective. Prentice-Hall, New York
Buttlar D, Farrell J, Nichols B (1996) PThreads programming. O’Reilly, Sebastopol
Carpenter PM, Ramirez A, Ayguade E (2009) Mapping stream programs onto heterogeneous multiprocessor systems. In: CASES ’09: proceedings of the 2009 international conference on compilers, architecture, and synthesis for embedded systems. ACM Press, New York, pp 57–66
Caspi P, Pilaud D, Halbwachs N, Plaice J (1987) Lustre: a declarative language for programming synchronous systems. In: Proceedings of the 14th ACM conference on principles of programming languages, pp 178–188
Chen MK, Li XF, Lian R, Lin JH, Liu L, Liu T, Ju R (2005) Shangri-la: Achieving high performance from compiled network applications while enabling ease of programming. In: PLDI ’05: proceedings of the 2005 ACM SIGPLAN conference on programming language design and implementation. ACM Press, New York
Farhad SM, Ko Y, Burgstaller B, Scholz B (2011) Orchestration by approximation: mapping stream programs onto multicore architectures. In: Proceedings of the sixteenth international conference on architectural support for programming languages and operating systems, ASPLOS ’11, New York, NY, USA. ACM Press, New York, pp 357–368
Google (2009) The Go programming language specification, retrieved Nov 2009. http://golang.org
Gordon MI, Thies W, Amarasinghe S (2006) Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. In: International conference on architectural support for programming languages and operating systems, San Jose, CA
Gummaraju J, Rosenblum M (2005) Stream programming on general-purpose processors. In: MICRO 38: proceedings of the 38th annual IEEE/ACM international symposium on microarchitecture. IEEE Computer Society Press, Los Alamitos, pp 343–354
Gupta R, Hill CR (1990) A scalable implementation of barrier synchronization using an adaptive combining tree. Int J Parallel Program 18:161–180
Hagiescu A, Wong W, Bacon DF, Rabbah R (2009) A computing origami: folding streams in FPGAs. In: DAC ’09: proceedings of the 2009 design automation conference. ACM Press, New York
Herlihy M, Shavit N (2008) The art of multiprocessor programming. Morgan Kaufmann, San Mateo
Hofstee HP (2005) Power efficient processor architecture and the Cell processor. In: HPCA ’05: proceedings of the 2005 international symposium on high-performance computer architecture. IEEE Computer Society Press, Los Alamitos, pp 258–262
Hormati AH, Choi Y, Kudlur M, Rabbah R, Mudge T, Mahlke S (2009) Flextream: Adaptive compilation of streaming applications for heterogeneous architectures. In: PACT ’09: proceedings of the 2009 18th international conference on parallel architectures and compilation techniques, Washington, DC, USA. IEEE Computer Society Press, Los Alamitos, pp 214–223
IBM Redbooks (2008) Programming the cell broadband engine architecture: examples and best practices. http://www.redbooks.ibm.com
IDC (2008) PC semiconductor market briefing: re-architecting the PC and the migration of value, June 2008, http://www.idc.com
ISO/IEC 8652:2007 (2006) Ada reference manual, 3rd edn
Kahn G (1974) The semantics of a simple language for parallel programming. In: Rosenfeld JL (ed) Information processing, Stockholm, Sweden, Aug. North Holland, Amsterdam, pp 471–475
Kapasi UJ, Dally WJ, Rixner S, Owens JD, Khailany B (2002) The imagine stream processor. In: Computer design, international conference on, p 282
Karczmarek M (2002) Constrained and phased scheduling of synchronous data flow graphs for the StreamIt language. Master’s thesis, Massachusetts Institute of Technology
Karczmarek M, Thies W, Amarasinghe S (2003) Phased scheduling of stream programs. ACM SIGPLAN Not 38(7):103–112
Karczmarek M, Thies W, Amarasinghe S (2003) Phased scheduling of stream programs. In: LCTES ’03: proceedings of the 2003 ACM SIGPLAN/SIGBED conference on languages, compilers, and tools for embedded systems, vol 38, pp 1235–1245
Karp RM, Miller RE (1966) Properties of a model for parallel computations: determinacy, termination, queueing. SIAM J Appl Math 14(6):1390–1411
Kudlur M, Mahlke S (2008) Orchestrating the execution of stream programs on multicore platforms. In: PLDI ’08: proceedings of the 2008 ACM SIGPLAN conference on programming language design and implementation. ACM Press, New York
Lee EA, Messerschmitt DG (1987) Static scheduling of synchronous data flow programs for digital signal processing. IEEE Trans Comput 36(1):24–35
Lee EA, Messerschmitt DG (1987) Synchronous data flow. Proc IEEE 75(9):1235–1245
Leung M-K, Liu I, Zou J (2008) Code generation for process network models onto parallel architectures. Technical Report UCB/EECS-2008-139, EECS Department, University of California, Berkeley
Lin C, Snyder L (2008) Principles of parallel programming. Addison-Wesley, Reading
Lin Y, Choi Y, Mahlke SA, Mudge TN, Chakrabarti C (2008) A parameterized dataflow language extension for embedded streaming systems. In: SAMOS ’08: proceedings of the 2008 international conference on embedded computer systems: architectures, modeling, and simulation, pp 10–17
Mattson TG, Sanders BA, Massingill BL (2007) Patterns for parallel programming, 3rd edn. Addison-Wesley, Reading
Waingold E, Taylor M, Sarkar V, Lee W, Lee V, Kim J, Frank M, Finch P, Devabhaktuni S, Barua R, Babb J, Amarasinghe S, Agarwal A (1997) Baring it all to software: the Raw machine. Computer 30:86–93
Pacheco PS (1996) Parallel programming with MPI. Morgan Kaufmann, San Francisco
Reinders J (2007) Intel threading building blocks. O’Reilly, Sebastopol
Sedgewick R (2002) Algorithms in C++, 3rd edn. Addison-Wesley-Longman, Reading
Sermulins J, Thies W, Rabbah R, Amarasinghe S (2005) Cache aware optimization of stream programs. In: LCTES ’05: proceedings of the 2005 ACM SIGPLAN/SIGBED conference on languages, compilers, and tools for embedded systems. ACM Press, New York, pp 115–126
Spring JH, Privat J, Guerraoui R, Vitek J (2007) StreamFlex: High-throughput stream programming in Java. In: OOPSLA ’07: proceedings of the 2007 ACM SIGPLAN conference on object-oriented programming systems and applications
Stephens R (1997) A survey of stream processing. Acta Inform 34:491–541
StreamIt research group (2006) StreamIt Cookbook. Online reference manual. Massachusetts Institute of Technology
StreamIt Web Site (2010) http://groups.csail.mit.edu/cag/streamit/, retrieved Dec 2010
Thies W (2009) Language and compiler support for stream programs. PhD thesis, Massachusetts Institute of Technology
Thies W, Amarasinghe S (2010) An empirical characterization of stream programs and its implications for language and compiler design. In: PACT ’10 proceedings of the 2010 conference on parallel architectures and compilation techniques. ACM Press, New York
Thies W, Karczmarek M, Amarasinghe SP (2002) StreamIt: A Language for Streaming Applications. In: CC ’02: proceedings of the 11th international conference on compiler construction, London, UK, LNCS. Springer, Berlin, pp 179–196
Udupa A, Govindarajan R, Thazhuthaveetil MJ (2009) Software pipelined execution of stream programs on GPUs. In: CGO ’09: proceedings of the 7th Annual IEEE/ACM international symposium on code generation and optimization. IEEE Computer Society Press, Los Alamitos
Udupa A, Govindarajan R, Thazhuthaveetil MJ (2009) Synergistic execution of stream programs on multicores with accelerators. In: LCTES ’09: proceedings of the 2009 ACM SIGPLAN/SIGBED conference on languages, compilers, and tools for embedded systems
Wei H, Yu J, Yu H, Gao GR (2010) Minimizing communication in rate-optimal software pipelining for stream programs. In: CGO ’10: proceedings of the 8th annual IEEE/ACM international symposium on code generation and optimization. ACM Press, New York, pp 210–217
Zhang D, Li QJ, Rabbah R, Amarasinghe S (2008) A lightweight streaming layer for multicore execution. SIGARCH Comput Archit News 36(2):18–27
Zhang D, Li Z, Song H, Liu L (2005) A programming model for an embedded media processing architecture. In: SAMOS ’05: proceedings of the 2005 international conference on embedded computer systems: architectures, modeling, and simulation, LNCS. Springer, Berlin
Cite this article
Hong, J., Hong, K., Burgstaller, B. et al. StreamPI: a stream-parallel programming extension for object-oriented programming languages. J Supercomput 61, 118–140 (2012). https://doi.org/10.1007/s11227-011-0656-7
DOI: https://doi.org/10.1007/s11227-011-0656-7