Serialization sets: a dynamic dependence-based parallel execution model

Authors: Matthew D. Allen, Srinath Sridharan, Gurindar S. Sohi

PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming

Pages 85 - 96

https://doi.org/10.1145/1504176.1504190

Published: 14 February 2009

Abstract

This paper proposes a new parallel execution model where programmers augment a sequential program with pieces of code called serializers that dynamically map computational operations into serialization sets of dependent operations. A runtime system executes operations in the same serialization set in program order, and may concurrently execute operations in different sets. Because serialization sets establish a logical ordering on all operations, the resulting parallel execution is predictable and deterministic.

We describe the API and design of Prometheus, a C++ library that implements the serialization set abstraction through compile-time template instantiation and a runtime support library. We evaluate a set of parallel programs running on the x86_64 and SPARC-V9 instruction sets and study their performance on multicore, symmetric multiprocessor, and ccNUMA parallel machines. By contrast with conventional parallel execution models, we find that Prometheus programs are significantly easier to write, test, and debug, and their parallel execution achieves comparable performance.
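
The execution model described above can be illustrated with a minimal C++ sketch. This is not the Prometheus API, which this page does not show. `SerializationRuntime`, `delegate`, `run`, `serializer`, and `Account` are hypothetical names chosen for illustration only. The sketch also simplifies the model: it first records all operations and then runs one thread per serialization set, and it assumes that operations in different sets touch disjoint data.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy runtime, NOT the Prometheus API.
class SerializationRuntime {
public:
    using SetId = std::size_t;
    using Operation = std::function<void()>;

    // Append an operation to the serialization set chosen by a serializer.
    // Calls are made from sequential code, so each set keeps program order.
    void delegate(SetId set, Operation op) {
        assert(op && "operation must be callable");
        sets_[set].push_back(std::move(op));
    }

    // Run all recorded operations: one worker thread per set, each executing
    // its set's operations in order. Different sets may run concurrently.
    void run() {
        std::vector<std::thread> workers;
        workers.reserve(sets_.size());
        for (auto& entry : sets_) {
            std::vector<Operation>* ops = &entry.second;
            workers.emplace_back([ops] {
                for (auto& op : *ops) op();
            });
        }
        for (auto& w : workers) w.join();
        sets_.clear();
    }

private:
    std::map<SetId, std::vector<Operation>> sets_;
};

// Hypothetical data type used only for this illustration.
struct Account {
    std::size_t id;
    long balance;
};

// Serializer: maps an operation on an account to a serialization set.
// Here every account forms its own set (an assumption of this sketch).
inline SerializationRuntime::SetId serializer(const Account& a) {
    return a.id;
}

// Applies a fixed sequence of order-sensitive updates; returns final balances.
inline std::vector<long> run_example() {
    std::vector<Account> accounts = {{0, 100}, {1, 100}, {2, 100}};
    struct Update { std::size_t account; long factor; long delta; };
    const std::vector<Update> updates = {
        {0, 2, 5}, {1, 1, -30}, {0, 3, 1}, {2, 1, 10}, {1, 2, 0}};

    SerializationRuntime rt;
    for (const Update& u : updates) {
        Account& acct = accounts[u.account];
        // balance * factor + delta does not commute, so order matters
        // within a set; updates to different accounts are independent.
        rt.delegate(serializer(acct), [&acct, u] {
            acct.balance = acct.balance * u.factor + u.delta;
        });
    }
    rt.run();

    std::vector<long> result;
    for (const Account& a : accounts) result.push_back(a.balance);
    return result;  // {616, 140, 110} on every run
}
```

Because the two updates to account 0 share a set, they always execute in program order (100 becomes 205, then 616), while the updates to accounts 1 and 2 may run at the same time. The result is therefore the same as the sequential program's on every run, which is the determinism property the abstract claims for the model.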

Cited By

Durvasula S, Zhao A, Kiguru R, Guan Y, Chen Z, Vijaykumar N (2024). ACE: Efficient GPU Kernel Concurrency for Input-Dependent Irregular Computational Graphs. Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, pages 258-270. DOI: 10.1145/3656019.3676897. Online publication date: 14-Oct-2024.
https://dl.acm.org/doi/10.1145/3656019.3676897
Tzannes A, Heumann S, Eloussi L, Vakilian M, Adve V, Han M (2019). Region and effect inference for safe parallelism. Automated Software Engineering, 26:2, pages 463-509. DOI: 10.1007/s10515-019-00257-3. Online publication date: 1-Jun-2019.
https://dl.acm.org/doi/10.1007/s10515-019-00257-3
Faes M, Gross T (2019). Parallel Roles for Practical Deterministic Parallel Programming. Languages and Compilers for Parallel Computing, pages 163-181. DOI: 10.1007/978-3-030-35225-7_12. Online publication date: 15-Nov-2019.
https://doi.org/10.1007/978-3-030-35225-7_12

Index Terms

Serialization sets: a dynamic dependence-based parallel execution model
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language features
        Concurrent programming structures
      2. Language types
        Parallel programming languages

Published In

PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming

February 2009

322 pages

ISBN:9781605583976

DOI:10.1145/1504176

General Chair: Daniel Reed, Microsoft Research, USA
Program Chair: Vivek Sarkar, Rice University, USA

ACM SIGPLAN Notices Volume 44, Issue 4
PPoPP '09
April 2009
294 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1594835

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

PPoPP09: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

February 14 - 18, 2009

Raleigh, NC, USA

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 57
Total Downloads: 816
Downloads (Last 12 months): 10
Downloads (Last 6 weeks): 0

Reflects downloads up to 20 Feb 2025

Citations

Cited By

Faes M, Gross T (2018). Concurrency-aware object-oriented programming with roles. Proceedings of the ACM on Programming Languages, 2:OOPSLA, pages 1-30. DOI: 10.1145/3276500. Online publication date: 24-Oct-2018.
https://dl.acm.org/doi/10.1145/3276500
Zhao D, Qiao K, Zhou Z, Li T, Zhou X, Raicu I (2016). Exploiting multi-cores for efficient interchange of large messages in distributed systems. Concurrency and Computation: Practice & Experience, 28:13, pages 3568-3585. DOI: 10.1002/cpe.3742. Online publication date: 10-Sep-2016.
https://dl.acm.org/doi/10.1002/cpe.3742
Pan A, Pai V, Kern J, Vetter J (2015). Runtime-driven shared last-level cache management for task-parallel programs. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-12. DOI: 10.1145/2807591.2807625. Online publication date: 15-Nov-2015.
https://dl.acm.org/doi/10.1145/2807591.2807625
Tzannes A, Heumann S, Eloussi L, Vakilian M, Adve V, Han M, Cohen M, Grunske L, Whalen M (2015). Region and effect inference for safe parallelism. Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering, pages 512-523. DOI: 10.1109/ASE.2015.59. Online publication date: 9-Nov-2015.
https://dl.acm.org/doi/10.1109/ASE.2015.59
Komuravelli R, Adve S, Chou C (2014). Revisiting the Complexity of Hardware Cache Coherence and Some Implications. ACM Transactions on Architecture and Code Optimization, 11:4, pages 1-22. DOI: 10.1145/2663345. Online publication date: 8-Dec-2014.
https://dl.acm.org/doi/10.1145/2663345
De Carli L, Sommer R, Jha S, Ahn G, Yung M, Li N (2014). Beyond Pattern Matching. Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages 1378-1390. DOI: 10.1145/2660267.2660361. Online publication date: 3-Nov-2014.
https://dl.acm.org/doi/10.1145/2660267.2660361
Gilad E, Mackay E, Oskin M, Etsion Y, Singer J, Kulkarni M, Harris T (2014). O-structures. Proceedings of the workshop on Memory Systems Performance and Correctness, pages 1-8. DOI: 10.1145/2618128.2618130. Online publication date: 13-Jun-2014.
https://dl.acm.org/doi/10.1145/2618128.2618130
