SCnC: Efficient Unification of Streaming with Dynamic Task Parallelism

Sbîrlea, Dragoş; Shirako, Jun; Newton, Ryan; Sarkar, Vivek

doi:10.1007/s10766-015-0353-x

Published: 03 March 2015

International Journal of Parallel Programming, Volume 44, pages 233–256 (2016)

Abstract

Stream processing is a special form of the dataflow execution model that offers extensive opportunities for optimization and automatic parallelization. To take full advantage of the paradigm, programmers are typically required to learn a new language and re-implement their applications. This work shows that it is possible to exploit streaming as a safe and automatic optimization of a more general dataflow-based model: one in which computation kernels are written in standard, general-purpose languages and organized as a coordination graph. We propose streaming concurrent collections (SCnC), a streaming system that can efficiently run a subset of the programs supported by concurrent collections (CnC). CnC is a general-purpose parallel programming paradigm that integrates task parallelism and dataflow computing. The proposed streaming support allows application developers to reason about their program as a general dataflow graph, while benefiting from the performance and tight memory footprint of stream parallelism when their program satisfies streaming constraints. In this paper, we formally define the application requirements for using SCnC and outline a static decision procedure for identifying and processing eligible SCnC subgraphs. We present initial results showing that transitioning from general CnC to SCnC leads to a throughput increase of up to 40× for certain benchmarks, and also enables programs with large data sizes to execute in available memory in cases where CnC execution may run out of memory.

Notes

The OpenStream system, developed concurrently with this work, has a similar feature, but the OpenStream state cannot be inferred from stream accesses.
If there are multiple consumer functions, all combinations must be considered and only the maximum buffer size obtained is safe.
As shown in Sect. 3.2, the control graph is a tree, so there is only one such path.

References

Thies, W., Karczmarek, M., Amarasinghe, S.P.: Streamit: a language for streaming applications. In: CC ’02, pp. 179–196. Springer, London
Budimlic, Z., Burke, M., Cavé, V., Knobe, K., Lowney, G., Newton, R., Palsberg, J., Peixotto, D.M., Sarkar, V., Schlimbach, F., Tasirlar, S.: Concurrent collections. Sci. Program. 18(3–4), 203–207 (2010)
Blumofe, R.D., Leiserson, C.E.: Scheduling multithreaded computations by work stealing. J. ACM 46, 720–748 (1999)
Agarwal, S., Barik, R., Bonachea, D., Sarkar, V., Shyamasundar, R.K., Yelick, K.: Deadlock-free scheduling of X10 computations with bounded resources. In: SPAA ’07 ACM, New York
Guo, Y., Barik, R., Raman, R., Sarkar, V.: Work-first and help-first scheduling policies for async-finish task parallelism. In: IPDPS’09
MathWorks Symbolic Math Toolbox Documentation. http://www.mathworks.com/help/symbolic/index.html. Accessed Feb 2015
Li, P., Agrawal, K., Buhler, J., Chamberlain, R.D.: Deadlock avoidance for streaming computations with filtering. In: SPAA ’10
Li, P., Agrawal, K., Buhler, J., Chamberlain, R.D., Lancaster, J.M.: Deadlock-avoidance for streaming applications with split-join structure: two case studies. In: ASAP, pp. 333–336 (2010)
Soulé, R., Gordon, M.I., Amarasinghe, S., Grimm, R., Hirzel, M.: Hitting the Sweet Spot for Streaming Languages. NY University CS Technical Report TR2012-948 (2009)
Cavé, V., Zhao, J., Shirako, J., Sarkar, V.: Habanero-java: the new adventures of old X10. In: Proceedings of the 9th International Conference on Principles and Practice of Programming in Java, PPPJ ’11 (2011)
Shirako, J., Peixotto, D.M., Sarkar, V., Scherer, W.N.: Phasers: a unified deadlock-free construct for collective and point-to-point synchronization. In: ICS ’08, pp. 277–288, ACM, New York
Shirako, J., Peixotto, D.M., Sarkar, V., Scherer, W.N.: Phaser accumulators: a new reduction construct. In: IPDPS 09
Georges, A., Buytaert, D., Eeckhout, L.: Statistically rigorous java performance evaluation. In: OOPSLA’07, pp. 57–76. ACM
Meyerson, A.: Online facility location. In: FOCS ’01
Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8, 679–698 (1986)
Nijhuis, M., Bos, H., Bal, H.E.: A component-based coordination language for efficient reconfigurable streaming applications. In: ICPP (2007)
Nijhuis, M.: Framework for parallel streaming applications. Ph.D. dissertation (2007)
Auerbach, J., Bacon, D.F., Cheng, P., Rabbah, R.: Lime: a java-compatible and synthesizable language for heterogeneous architectures. In: OOPSLA ’10, pp. 89–108, ACM, New York
Liao, S., Du, Z., Wu, G., Lueh, G.-Y.: Data and computation transformations for brook streaming applications on multiprocessors. In: CGO ’06, pp. 196–207, IEEE Computer Society, Washington
Buck, I., Foley, T., Horn, D., Sugerman, J., Fatahalian, K., Houston, M., Hanrahan, P.: Brook for gpus: stream computing on graphics hardware. In: SIGGRAPH ’04, pp. 777–786, ACM, New York (2004)
Aoyagi, Y., Uehara, M., Mori, H.: A case study on predictive method of task allocation in stream-based computing. In: Proceedings of the 13th International Conference on Information Networking, ICOIN ’98
Collins, R.L., Carloni, L.P.: Flexible filters: load balancing through backpressure for stream programs. In: EMSOFT ’09
Aleen, F., Sharif, M., Pande, S.: Input-driven dynamic execution prediction of streaming applications. In: PPoPP ’10, pp. 315–324
Miranda, C., Pop, A., Dumont, P., Cohen, A., Duranton, M.: Erbium: a deterministic, concurrent intermediate representation to map data-flow tasks to scalable, persistent streaming processes. In: CASES ’10, pp. 11–20. ACM
Vandierendonck, H., Tzenakis, G., Nikolopoulos, D.S.: A unified scheduler for recursive and task dataflow parallelism. In: PACT ’11
Pop, A., Cohen, A.: Openstream: expressiveness and data-flow compilation of openmp streaming programs. In: TACO ’13

Author information

Authors and Affiliations

Rice University, 6100 Main St, Houston, TX, 77030, USA
Dragoş Sbîrlea, Jun Shirako & Vivek Sarkar
Indiana University, 107 South Indiana Avenue, Bloomington, IN, 47405, USA
Ryan Newton

Corresponding author

Correspondence to Dragoş Sbîrlea.

About this article

Cite this article

Sbîrlea, D., Shirako, J., Newton, R. et al. SCnC: Efficient Unification of Streaming with Dynamic Task Parallelism. Int J Parallel Prog 44, 233–256 (2016). https://doi.org/10.1007/s10766-015-0353-x

Received: 28 February 2013
Accepted: 19 February 2015
Published: 03 March 2015
Issue Date: April 2016
DOI: https://doi.org/10.1007/s10766-015-0353-x
