Abstract
In this paper, we address the problem of scheduling parallel tasks with general synchronization patterns using a cooperative runtime. Current implementations of task-parallel programming models provide efficient support for fork-join parallelism, but are unable to efficiently support more general synchronization patterns such as locks, futures, barriers and phasers. We propose a novel approach to this challenge based on cooperative scheduling with one-shot delimited continuations (OSDeConts) and event-driven controls (EDCs). OSDeConts enable the runtime to suspend a task at any point, freeing the task’s worker to switch to another task, whereas other runtimes may force the worker to block. EDCs ensure that suspended tasks that are ready to be resumed can be identified efficiently. Furthermore, our approach is more efficient than schedulers that spawn additional worker threads to compensate for blocked worker threads.
We have implemented our cooperative runtime in Habanero-Java (HJ), an explicitly parallel language with a large variety of synchronization patterns. The OSDeConts and EDC primitives are used to implement a wide range of synchronization constructs, including those where a task may trigger the enablement of multiple suspended tasks (as in futures, barriers and phasers). In contrast, current task-parallel runtimes and schedulers for the fork-join model (including schedulers for the Cilk language) focus on the case where only one continuation is enabled by an event (typically, the termination of the last child/descendant task in a join scope). Our experimental results show that the HJ cooperative runtime delivers significant improvements in performance and memory utilization on various benchmarks using future and phaser constructs, relative to a thread-blocking runtime system while using the same underlying work-stealing task scheduler.
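To make the suspend-and-resume idea concrete, the following is a minimal sketch in plain Java, not the Habanero-Java runtime or its API. All names (`EdcSketch`, `EventDrivenControl`, `suspendUntilResolved`, `resolve`, `runDemo`) are hypothetical. Since standard Java has no delimited continuations, the sketch assumes the "rest of the task" is passed explicitly as a callback, and it uses a single worker loop over a plain deque in place of a work-stealing scheduler. It illustrates only the case where one event enables several suspended tasks without any worker blocking.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical illustration only; not the Habanero-Java API. */
public class EdcSketch {

    /** A single-assignment event that parks continuations until it is resolved. */
    static final class EventDrivenControl<T> {
        private final List<Consumer<T>> waiters = new ArrayList<>();
        private final Deque<Runnable> readyQueue;
        private T value;
        private boolean resolved = false;

        EventDrivenControl(Deque<Runnable> readyQueue) {
            this.readyQueue = readyQueue;
        }

        /** Called where a task would otherwise block; the callback stands in for a continuation. */
        void suspendUntilResolved(Consumer<T> continuation) {
            if (resolved) {
                T v = value;
                readyQueue.addLast(() -> continuation.accept(v)); // already ready: reschedule at once
            } else {
                waiters.add(continuation); // park the continuation; the worker stays free
            }
        }

        /** Fires the event and moves every parked continuation to the ready queue. */
        void resolve(T v) {
            if (resolved) {
                throw new IllegalStateException("already resolved");
            }
            resolved = true;
            value = v;
            for (Consumer<T> k : waiters) {
                readyQueue.addLast(() -> k.accept(v));
            }
            waiters.clear();
        }
    }

    /** Runs two consumer tasks and one producer task on a single worker loop. */
    static List<String> runDemo() {
        Deque<Runnable> ready = new ArrayDeque<>();
        List<String> log = new ArrayList<>();
        EventDrivenControl<Integer> future = new EventDrivenControl<>(ready);

        ready.addLast(() -> {
            log.add("A starts");
            future.suspendUntilResolved(v -> log.add("A resumes with " + v));
        });
        ready.addLast(() -> {
            log.add("B starts");
            future.suspendUntilResolved(v -> log.add("B resumes with " + v));
        });
        ready.addLast(() -> {
            log.add("producer resolves");
            future.resolve(42);
        });

        // The single worker never blocks: it simply takes the next ready task.
        while (!ready.isEmpty()) {
            ready.pollFirst().run();
        }
        return log;
    }

    public static void main(String[] args) {
        runDemo().forEach(System.out::println);
    }
}
```

In this sketch both consumers start before the producer runs, park their continuations, and are resumed only after `resolve` is called. The sketch is single-threaded for determinism; a multi-worker version would need thread-safe queues and synchronization inside the event object.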
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Imam, S., Sarkar, V. (2014). Cooperative Scheduling of Parallel Tasks with General Synchronization Patterns. In: Jones, R. (eds) ECOOP 2014 – Object-Oriented Programming. ECOOP 2014. Lecture Notes in Computer Science, vol 8586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44202-9_25
DOI: https://doi.org/10.1007/978-3-662-44202-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44201-2
Online ISBN: 978-3-662-44202-9
eBook Packages: Computer Science, Computer Science (R0)