Abstract
Work-stealing schedulers for parallel tasks yield near-optimal task distribution (i.e., all CPU cores are loaded equally) and have low time, memory, and inter-thread synchronization overheads. The key idea of the work-stealing strategy is that when a scheduler worker runs out of tasks to execute, it starts stealing tasks from the queues of other workers. It has been shown that double-ended queues based on circular arrays are effective in this scenario. They are designed under the assumption that task pointers are stored in these data structures, while the task objects reside in heap memory. By modifying the task queues so that they hold task objects instead of pointers, we increased performance by more than 2.5 times on CPU-bound applications and decreased last-level cache misses by 30% compared to the Intel TBB and Intel/MIT Cilk work-stealing schedulers.
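The idea of storing task objects directly in a circular-array deque can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Task`, `InlineTaskDeque`, and `demo` are hypothetical, the capacity is fixed, and a mutex stands in for the lock-free synchronization that production work-stealing deques use. It only shows the layout (tasks held by value in one contiguous array) and the two access patterns (the owner pushes and pops at one end, a thief steals from the other).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical task type: a small object that is cheap to copy by value.
struct Task {
    int id;
    int payload;
};

// Fixed-capacity circular deque that stores Task objects inline, so no
// per-task heap allocation is needed. Assumption: a mutex replaces the
// lock-free protocol of real schedulers to keep the sketch short.
template <std::size_t Capacity>
class InlineTaskDeque {
public:
    // Owner side: copy a task into the bottom slot. Returns false when full.
    bool push(const Task& t) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (bottom_ - top_ == Capacity) return false;
        slots_[bottom_ % Capacity] = t;
        ++bottom_;
        return true;
    }

    // Owner side: take the most recently pushed task (LIFO order).
    bool pop(Task& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (bottom_ == top_) return false;
        --bottom_;
        out = slots_[bottom_ % Capacity];
        return true;
    }

    // Thief side: take the oldest task (FIFO order) from the opposite end.
    bool steal(Task& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (bottom_ == top_) return false;
        out = slots_[top_ % Capacity];
        ++top_;
        return true;
    }

private:
    std::array<Task, Capacity> slots_{};  // tasks live here, not on the heap
    std::size_t top_ = 0;                 // index of the oldest task
    std::size_t bottom_ = 0;              // index one past the newest task
    std::mutex mutex_;
};

// Usage: the owner pushes three tasks, a thief steals the oldest one,
// and the owner pops the newest one.
inline void demo() {
    InlineTaskDeque<8> deque;
    deque.push(Task{1, 10});
    deque.push(Task{2, 20});
    deque.push(Task{3, 30});

    Task stolen{};
    Task popped{};
    assert(deque.steal(stolen) && stolen.id == 1);
    assert(deque.pop(popped) && popped.id == 3);
}
```

Because the slots are contiguous, a worker that pushes and pops its own tasks touches neighbouring memory, which is the kind of locality the abstract credits for the reduction in cache misses. A pointer-based deque would instead dereference into separately allocated heap objects.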
Acknowledgements
This research was supported by RFBR grants No. 18-01-00125-a and 16-07-01111.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Kuchumov, R., Sokolov, A., Korkhov, V. (2018). Staccato: Cache-Aware Work-Stealing Task Scheduler for Shared-Memory Systems. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2018. ICCSA 2018. Lecture Notes in Computer Science, vol 10963. Springer, Cham. https://doi.org/10.1007/978-3-319-95171-3_8
DOI: https://doi.org/10.1007/978-3-319-95171-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95170-6
Online ISBN: 978-3-319-95171-3
eBook Packages: Computer Science, Computer Science (R0)