DOI: 10.1145/2481268.2481277
research-article

Hybrid parallel task placement in X10

Published: 20 June 2013

Abstract

This paper presents a hybrid parallel task-placement strategy that combines work stealing and work dealing to improve workload distribution across nodes in distributed shared-memory machines. Existing work-dealing-based load balancers suffer from large performance penalties resulting from excessive task migration and from excessive communication among the nodes to determine the target node for a migrated task. This work employs a simple heuristic to determine the load status of a node and also to detect a good target for migration of tasks.
Experimental evaluations on applications chosen from the Cowichan and Lonestar suites show that, on a cluster of 128 cores, the proposed approach achieves speedups of 2% to 16% over the state-of-the-art work-stealing scheduler.
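
The combination of work stealing and work dealing described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's scheduler: the abstract does not specify the heuristic, so the watermark thresholds, the `Node` class, and the "pick the least-loaded node" target rule below are all hypothetical stand-ins chosen for illustration.

```python
from collections import deque

# Hypothetical thresholds; the abstract does not state the paper's actual heuristic.
LOW_WATERMARK = 1    # at or below this queue length a node counts as underloaded
HIGH_WATERMARK = 4   # above this queue length a node counts as overloaded


class Node:
    """A simulated node that holds its pending tasks in a double-ended queue."""

    def __init__(self, name):
        self.name = name
        self.tasks = deque()

    def load_status(self):
        # Assumed heuristic: classify the node purely by its local queue length.
        n = len(self.tasks)
        if n > HIGH_WATERMARK:
            return "overloaded"
        if n <= LOW_WATERMARK:
            return "underloaded"
        return "normal"


def deal(node, nodes):
    """Work dealing: an overloaded node pushes surplus tasks to other nodes."""
    others = [n for n in nodes if n is not node]
    moved = 0
    while others and node.load_status() == "overloaded":
        # Assumed target rule: send to the node with the shortest queue.
        target = min(others, key=lambda n: len(n.tasks))
        if len(target.tasks) + 1 >= len(node.tasks):
            break  # migrating would not improve the balance
        target.tasks.append(node.tasks.pop())
        moved += 1
    return moved


def steal(node, nodes):
    """Work stealing: a node with no tasks pulls one from the most-loaded node."""
    others = [n for n in nodes if n is not node]
    if node.tasks or not others:
        return False
    victim = max(others, key=lambda n: len(n.tasks))
    if not victim.tasks:
        return False
    # Take from the left end, opposite to the end the dealer pops from.
    node.tasks.append(victim.tasks.popleft())
    return True


def balance(nodes):
    """One hybrid round: overloaded nodes deal first, then empty nodes steal."""
    for node in nodes:
        deal(node, nodes)
    for node in nodes:
        steal(node, nodes)


a, b, c = Node("a"), Node("b"), Node("c")
a.tasks.extend(range(10))
c.tasks.extend(range(10, 12))
balance([a, b, c])
print([len(n.tasks) for n in (a, b, c)])
```

Dealing lets a busy node shed load without waiting to be asked, while stealing lets an empty node recover work on its own initiative; the early `break` in `deal` is one simple way to avoid the excessive migration that the abstract attributes to dealing-only balancers.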

Published In

X10 '13: Proceedings of the third ACM SIGPLAN X10 Workshop
June 2013
47 pages
ISBN:9781450321570
DOI:10.1145/2481268
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. X10
  2. async
  3. parallel programming
  4. place flexible

Qualifiers

  • Research-article

Conference

PLDI '13
Acceptance Rates

X10 '13 Paper Acceptance Rate 5 of 5 submissions, 100%;
Overall Acceptance Rate 5 of 5 submissions, 100%

Cited By

  • (2018) Hybrid work stealing of locality-flexible and cancelable tasks for the APGAS library. The Journal of Supercomputing 74:4 (1435-1448). DOI: 10.1007/s11227-018-2234-8. Online publication date: 1-Apr-2018
  • (2018) A Combination of Intra- and Inter-place Work Stealing for the APGAS Library. Parallel Processing and Applied Mathematics (234-243). DOI: 10.1007/978-3-319-78054-2_22. Online publication date: 23-Mar-2018
  • (2017) APHiD. Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (228-237). DOI: 10.1109/CCGRID.2017.33. Online publication date: 14-May-2017
  • (2015) Hybrid parallel task placement in irregular applications. Journal of Parallel and Distributed Computing 76:C (94-105). DOI: 10.1016/j.jpdc.2014.09.014. Online publication date: 1-Feb-2015