A general method for rendering static analyses for diverse concurrency models modular

doi:10.1016/j.jss.2018.10.001

Journal of Systems and Software

Volume 147, January 2019, Pages 17-45

https://doi.org/10.1016/j.jss.2018.10.001

Highlights

• A discussion of the limitations of static analysis techniques for concurrent programs.
• A general method for building scalable modular analysis of concurrent programs.
• The application of this method to actors and to threads.
• A theoretical and empirical evaluation of the resulting analyses.
• A new benchmark suite for evaluating static analysis of multi-threaded applications.

Abstract

Shared-memory multi-threading and the actor model share the notion of processes featuring communication, respectively by modifying shared state and by sending messages. Existing static analyses for concurrent programs either model every possible process interleaving and therefore suffer from the state explosion problem, or feature modularity but lack precision or support for dynamic processes. In this paper we present a general method for obtaining a scalable analysis of concurrent programs featuring dynamic process creation. Our ModConc method transforms an abstract concurrent semantics modeling processes and communication into a modular static analysis treating the behavior of processes separately from their communication. We present ModConc in a generic way and demonstrate its applicability by instantiating it for multi-threaded and actor-based programs. The resulting analyses are evaluated in terms of precision, performance, scalability, and soundness. While a typical non-modular static analysis times out on half of our 56 benchmarks with a 30 min timeout, ModConc analyses successfully analyze all of them in less than 30 s, while remaining on par in terms of precision. Analyzing concurrent processes in isolation while modeling their communications is the key ingredient in supporting scalable analysis of concurrent programs featuring dynamic process communication.

Introduction

In most concurrent programming models, programs consist of entities called processes that run concurrently with each other and that interfere through communication effects. Interprocess communication effects can be accesses and modifications to shared variables in thread models, or messages exchanged between different actors in the actor model (Agha, 1986; Hewitt, Bishop, Steiger, 1973). At run time, a single process in a concurrent program may create an unbounded number of additional processes. This combination of process creation and communication effects in concurrent programs results in highly dynamic control flow and data flow.

Static analyses for concurrent programs have been proposed, and are discussed in detail in Section 8. Most of these existing analyses either explicitly take into account every possible interleaving of concurrent executions at points where interprocess communication occurs, rendering them non-scalable as they are subject to the state explosion problem (Valmari, 1996), or are modular but limited in one of the following important properties: automation, precision, or support for programs in which the set of processes evolves dynamically. Performing static analysis of concurrent programs featuring unbounded processes in an automated, scalable, and precise way is therefore still an open problem, which we tackle in this paper.

Scalability of a static analysis can be achieved by modularizing it according to the general framework proposed by Cousot and Cousot (2002). A modular analysis treats the behavior of components (in our case, processes) separately from their interferences (in our case, communication). Modular static analysis has been explored in the context of shared-memory concurrency by Miné (2014) and for synchronous sequential processes by Midtgaard et al. (2016a). However, these and similar analyses are limited to programs with a known and fixed number of processes and therefore do not support analyzing programs where processes can be created dynamically.

In a modular analysis, the analysis of a component can trigger other components for re-analysis. This violates the notion of compositionality of an analysis, in which the results of the analysis for a whole program are the composition of the results of the analysis of each component. In a modular analysis, the results of the analysis follow from a fixed point obtained in the analysis of each component, having taken other components that interfere with it into account. In the case of concurrent programs, a thread that accesses a variable modified by another thread has to be reconsidered for analysis, as does an actor to which another actor may send a message.

This paper proposes ModConc, an approach for the modular static analysis of concurrent programs with unbounded dynamic process creation. This sets it apart from the aforementioned modular analyses for concurrent programs (Miné, 2014; Midtgaard, Nielson, Nielson, 2016a), which require a static process topology. We design ModConc as a general technique for designing scalable modular analyses of concurrent programs based on their concurrent semantics. We demonstrate the use of ModConc to render an AAM-style analysis (Horn and Might, 2010) of concurrent programs scalable. AAM is a well-studied abstract interpretation method, and is used here to analyze the behavior of a single process in a flow-insensitive and context-insensitive manner. We demonstrate our approach on two concurrency models supporting dynamic process creation: threads and actors. Note that the choice of AAM and of the sensitivities is made only to demonstrate the application of ModConc, and that ModConc is not limited to such analyses.

The core insight behind ModConc is that the dynamic behavior of a process is entirely defined by its code and its communication effects. We therefore construct an intra-process analysis that analyzes a single process in isolation to infer the processes created and communication effects generated by this process under a given set of input conditions. The intra-process analysis is based on a modified version of the program semantics, replacing concurrent operations by operations that denote the corresponding generated effects but otherwise do not modify the analysis state of other processes in any way. The information obtained from the intra-process analyses is subsequently used by an inter-process analysis to compute the set of processes that are interfered with by the analyzed processes and therefore require (additional) intra-process analysis. When no new interprocess communication effects can be discovered, a sound over-approximation of the behavior of all processes in the program is obtained. The result is a whole-program analysis for concurrent programs, modular in the sense of Cousot and Cousot (2002), that infers the set of all running processes and their communication effects in a sound and scalable manner.
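The interplay between the two analyses can be illustrated with a small worklist loop. This is only a sketch under strong simplifying assumptions, not the algorithm or implementation of the paper: processes are identified by plain strings, the only communication effects are reads and writes of shared variables, and the intra-process analysis is a hypothetical hand-written function (`toy_intra`). The names `Effects` and `inter_process_analysis` are likewise invented for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple

# Assumed toy store: shared variable name -> set of abstract values.
Store = Dict[str, Set[str]]


@dataclass
class Effects:
    """What one intra-process analysis reports (hypothetical representation)."""
    spawned: Set[str] = field(default_factory=set)           # processes created
    reads: Set[str] = field(default_factory=set)             # variables read
    writes: Dict[str, Set[str]] = field(default_factory=dict)  # values written


def inter_process_analysis(
    main: str, intra: Callable[[str, Store], Effects]
) -> Tuple[Set[str], Store]:
    """Re-run the intra-process analysis until no new effects are discovered."""
    store: Store = {}
    readers: Dict[str, Set[str]] = {}  # variable -> processes that read it
    seen = {main}
    worklist = deque([main])
    while worklist:
        pid = worklist.popleft()
        effects = intra(pid, store)  # analyze one process in isolation
        for var in effects.reads:
            readers.setdefault(var, set()).add(pid)
        for child in effects.spawned:
            if child not in seen:  # newly discovered process
                seen.add(child)
                worklist.append(child)
        for var, values in effects.writes.items():
            known = store.setdefault(var, set())
            if not values <= known:  # the write adds new information
                known |= values
                for reader in readers.get(var, ()):
                    if reader not in worklist:  # schedule re-analysis
                        worklist.append(reader)
    return seen, store


def toy_intra(pid: str, store: Store) -> Effects:
    """Hypothetical stand-in for a real intra-process analysis."""
    if pid == "main":
        return Effects(spawned={"worker"}, reads={"result"},
                       writes={"flag": {"bool"}})
    if pid == "worker":
        # The worker writes `result` only once it has observed `flag`.
        writes = {"result": {"int"}} if store.get("flag") else {}
        return Effects(reads={"flag"}, writes=writes)
    raise KeyError(pid)


processes, final_store = inter_process_analysis("main", toy_intra)
print(sorted(processes), final_store)
```

In this toy run, `main` is analyzed a second time because `worker` writes a variable that `main` reads. The loop stops once a full pass yields no new processes and no new written values, which mirrors the fixed point described above. Termination here relies on the assumed finite set of abstract values.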

A ModConc analysis is capable of inferring properties of concurrent programs that form the foundation of tool support for addressing pressing problems in software engineering such as program comprehension, bug detection and program verification. These inferred properties concern the processes created and their communication effects in addition to the traditional data and control flow properties computed by analyses for sequential programs. Our evaluation demonstrates that modular analyses designed with ModConc scale linearly with both the number of abstract processes created and the number of communication effects. The analyses do not suffer from the state explosion problem and are therefore able to analyze concurrent programs from a benchmark suite that consists of the actor-based programs from the well-known Savina benchmark suite (Imam and Sarkar, 2014) and their shared-memory multi-threaded equivalents, in a matter of seconds.

To summarize, the contributions of this paper are the following.

• An extension of the framework by Cousot and Cousot of modular analysis (Cousot and Cousot, 2002) to concurrent programs with dynamic process creation.
• The application of this extension to both thread-based and actor-based concurrency.
• The formalization, empirical validation, and discussion of termination, soundness, and complexity of the approach on an analysis for thread-based and actor-based concurrency.
• The construction of a benchmark suite composed of programs exposing dynamic creation of threads in a shared-memory concurrency setting, similar to the Savina benchmark suite for actor programs.

Section snippets

Context: Models of concurrency and their dynamic behavior

Most concurrency models share the concepts of processes and communication. A process is a unit of computation isolated from other processes, except for interferences (communications) that can occur between processes. In the thread model, a thread is a process and communication happens through shared variables, locks, and thread joining. In the actor programming paradigm, an actor is a process and communication happens through the exchange of messages.
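To make the two kinds of communication concrete, the following sketch models them as plain data. It is an illustrative encoding only: the class names (`VarAccess`, `LockOp`, `Join`, `Send`) and the helper functions are assumptions introduced here, not definitions taken from the paper.

```python
from dataclasses import dataclass
from typing import Set, Union


# Thread model: communication through shared variables, locks, and joins.
@dataclass(frozen=True)
class VarAccess:
    thread: str
    var: str
    is_write: bool


@dataclass(frozen=True)
class LockOp:
    thread: str
    lock: str
    acquire: bool  # False means the lock is released


@dataclass(frozen=True)
class Join:
    thread: str  # the thread that blocks
    target: str  # the thread being joined


# Actor model: communication through the exchange of messages.
@dataclass(frozen=True)
class Send:
    sender: str
    receiver: str
    message: str


Communication = Union[VarAccess, LockOp, Join, Send]


def model_of(c: Communication) -> str:
    """Which concurrency model a communication belongs to in this encoding."""
    return "actors" if isinstance(c, Send) else "threads"


def processes_involved(c: Communication) -> Set[str]:
    """Processes named directly by a communication."""
    if isinstance(c, Send):
        return {c.sender, c.receiver}
    if isinstance(c, Join):
        return {c.thread, c.target}
    return {c.thread}  # variable and lock operations name only their own thread


print(model_of(Send("a1", "a2", "ping")), processes_involved(Join("t1", "t2")))
```

Note that a variable access or lock operation names only the thread performing it. Which other threads it interferes with depends on who else touches the same variable or lock, which is the kind of information an analysis has to infer.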

Communication effects may trigger new

The ModConc Approach

Non-modular analyses explicitly explore all possible interleavings of the transition relation of a concurrent semantics (or a sound subset thereof) to derive how a concurrent program evolves at every program point. A non-modular analysis concurrently keeps track of the state of all processes and may step one of them at any given point. Every interference between two processes (process creation and interprocess communication) is immediately effected on the global analysis state.
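The cost of exploring interleavings can be illustrated with a simple count. The sketch below assumes an idealized setting that is not taken from the paper: a fixed number of processes, each executing the same number of atomic steps, with every step treated as a potential interference point. A real non-modular analysis explores abstract states rather than raw schedules, so the numbers are only indicative of the growth rate.

```python
from math import factorial


def interleavings(num_processes: int, steps_per_process: int) -> int:
    """Number of distinct schedules of `num_processes` processes that each
    execute `steps_per_process` atomic steps (a multinomial coefficient)."""
    total_steps = num_processes * steps_per_process
    return factorial(total_steps) // factorial(steps_per_process) ** num_processes


for n in (2, 3, 4):
    print(n, "processes x 5 steps:", interleavings(n, 5))
```

Under these assumptions, adding a single process multiplies the number of schedules by a rapidly growing factor. This is the intuition behind the state explosion problem that a modular analysis avoids by never enumerating schedules.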

Analyses

Base language: λ₀

In this work we explore both thread-based (Section 5) and actor-based concurrency (Section 6) on top of a Scheme-like base language λ₀. This base language is based on the λ-calculus in A-Normal Form (Flanagan et al., 1993), underlining the fact that ModConc readily supports higher-order languages. This restricted language is chosen to enable focusing on the core principles of ModConc, while still preserving the challenging aspects of combining higher-order with concurrent models that feature

Shared-memory concurrency with threads: λ_τ

We now add support for shared-memory concurrency to the base language λ₀ defined in the previous section by adding three new concepts.

1. Thread creation and joining. Threads can be created to compute a value in a different process, and a thread can join another thread to obtain the final value of the computation performed by that other thread. Thread joining is a blocking operation and is a form of synchronization. Note that the notion of thread joining (one thread blocking until another thread

Actor-based concurrency

In this section we add support for actors to the base language λ₀ from Section 4 and then show how ModConc can be applied to this extended language. We present this at a higher level, as most of the developments are similar to the ones made for λ_τ in the previous section. We add two new concepts to λ₀.

1. Actor definition, creation, and evolution. Actors can be created by instantiating some defined actor behavior, resulting in the creation of a new process. Actors are associated with state variables

Empirical validation

We applied the modular analyses described in this paper to a number of benchmark programs to compute flow graphs over-approximating the behavior of concurrent programs written in λ_τ and λ_α, and to deduce the communication effects that may be generated by each process. We performed a number of experiments on our implementation of the analyses and on their results.

Related work

Our ModConc method derives a modular concurrency analysis from a sequentialized concurrency analysis that collects communication effects instead of directly applying them. In this paper, the sequential analysis we use is inspired by Van Horn and Might’s work on Abstracting Abstract Machines (AAM) (Horn and Might, 2010). An AAM intra-process analysis can be parameterized in such a way that it is able to infer communication effects of processes with sufficient precision. The resulting ModConc

Conclusion and future work

This paper describes ModConc, a method to derive scalable modular static analyses for concurrent programs. Scalability is achieved by avoiding the state explosion problem that arises when an analysis models every possible process interleaving at every point that processes can possibly interfere. Instead, ModConc uses an intra-process analysis to analyze the behavior of a single process in isolation to infer which processes and communication effects this process generates. The information

Acknowledgments

Quentin Stiévenart is funded by the strategic research program titled “Foundations of Programming Models for Next-Generation Computing Platforms” funded by Vrije Universiteit Brussel. Jens Nicolay is funded by the SeCloud project sponsored by Innoviris, the Brussels Institute for Research and Innovation.

Quentin Stiévenart is a post-doctoral researcher at the Software Languages Lab of the Vrije Universiteit Brussel, where he obtained his Ph.D. in 2018. His research interests are in the domain of program analysis, more specifically static program analysis for modern concurrent programs that can exhibit complex features for program analyses, going from higher-order functions to combinations of concurrency paradigms.

References (97)

L. Fredlund et al.
McErlang: a model checker for a distributed functional programming language
A. Miné
Static analysis of run-time errors in embedded real-time parallel C programs
Logical Methods Comput. Sci.
(2012)
M. Abadi et al.
Types for safe locking: static race detection for Java
ACM Trans. Program. Lang. Syst.
(2006)
H. Abelson et al.
Revised⁵ report on the algorithmic language Scheme
High. Order Symb. Comput.
(1998)
G. Agha et al.
A foundation for actor computation
J. Funct. Program.
(1997)
G.A. Agha
ACTORS - A model of concurrent computation in distributed systems
(1986)
E.S. Andreasen et al.
Systematic approaches for increasing soundness and precision of static analyzers
Proceedings of the 6th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis
(2017)
C. Artho et al.
Applying static analysis to large-scale, multi-threaded Java programs
Proceedings of the 13th Australian Software Engineering Conference (ASWEC 2001)
(2001)
T. Arts et al.
System description: Verification of distributed Erlang programs
T. Arts et al.
Verifying generic Erlang client-server implementations

M.F. Atig et al.

On bounded reachability analysis of shared memory systems

LIPIcs-Leibniz International Proceedings in Informatics

(2014)

M.F. Atig et al.

Verification of asynchronous programs with nested locks

LIPIcs-Leibniz International Proceedings in Informatics

(2018)

S. Blom et al.

The VerCors tool for verification of concurrent programs

Proceedings of the International Symposium on Formal Methods

(2014)

C. Boyapati et al.

Ownership types for safe programming: preventing data races and deadlocks

C. Boyapati et al.

A parameterized type system for race-free Java programs

C. Calcagno et al.

Compositional shape analysis by means of bi-abduction

E. Castegren et al.

Reference capabilities for concurrency control

Proceedings of the ECOOP

(2016)

M. Christakis et al.

Systematic testing for detecting concurrency errors in Erlang programs

Proceedings of the Sixth IEEE International Conference on Software Testing, Verification and Validation, ICST, Luxembourg, Luxembourg, March 18–22, 2013

(2013)

M. Christakis et al.

Static detection of race conditions in Erlang

E. Clarke et al.

SATABS: SAT-Based Predicate Abstraction for ANSI-C

Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems

(2005)

C. Colby

Analyzing the communication topology of concurrent programs

Proceedings of the ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation, La Jolla, California, USA, June 21–23, 1995

(1995)

J.C. Corbett et al.

Bandera: extracting finite-state models from Java source code

P. Cousot et al.

Modular static program analysis

S. Crafa

Behavioural types for actor systems

CoRR

(2012)

F. Dagnat et al.

Static analysis of communications for Erlang

Proceedings of the 8th International Erlang/OTP User Conference

(2002)

M. Dam et al.

On the verification of open distributed systems

E. D’Osualdo

Verification of message passing concurrent systems

(2015)

E. D’Osualdo et al.

Automatic verification of Erlang-style concurrency

E. D’Osualdo et al.

Soter: an automatic safety verifier for Erlang

D.R. Engler et al.

RacerX: effective, static detection of race conditions and deadlocks

C. Flanagan et al.

Types for safe locking

C. Flanagan et al.

Type-based race detection for Java

C. Flanagan et al.

Thread-modular verification for shared-memory programs

C. Flanagan et al.

Modular verification of multithreaded programs

Theor. Comput. Sci.

(2005)

C. Flanagan et al.

Dynamic partial-order reduction for model checking software

C. Flanagan et al.

The essence of compiling with continuations

P. Fonseca et al.

Finding complex concurrency bugs in large multi-threaded applications

Proceedings of the Sixth European Conference on Computer Systems, EuroSys

(2011)

E. Giachino et al.

Actors may synchronize, safely!

Proceedings of the 18th International Symposium on Principles and Practice of Declarative Programming

(2016)

E. Giachino et al.

Deadlock analysis of unbounded process networks

Proceedings of the International Conference on Concurrency Theory

(2014)

P. Godefroid

Partial-Order methods for the verification of concurrent systems - An approach to the state-Explosion problem

Lecture Notes in Computer Science

(1996)

P. Godefroid

Model checking for programming languages using VeriSoft

A. Gotsman et al.

Thread-modular shape analysis

K. Havelund et al.

Model checking Java programs using Java PathFinder

STTT

(2000)

L. Henrio et al.

Analysis of synchronisations in stateful active objects

Proceedings of the International Conference on Integrated Formal Methods

(2017)

T.A. Henzinger et al.

Thread-modular abstraction refinement

C. Hewitt et al.

A universal modular ACTOR formalism for artificial intelligence

G.J. Holzmann

The SPIN Model Checker - Primer and Reference Manual

(2004)

K. Honda et al.

Multiparty asynchronous session types


Jens Nicolay is a part-time professor at the Software Languages Lab of the VUB, where he obtained his Ph.D. in 2016. Before that, he was a software consultant for over 10 years. His main expertise is static analysis of functional and object-oriented languages. He currently coordinates the programming perspective of a research project which treats security of cloud applications in a holistic manner.

Coen De Roover is a professor in Software Engineering at the Software Languages Lab of the Vrije Universiteit Brussel, where he obtained his Ph.D. in 2009. The central theme of his research is the design of program analysis and transformation techniques, and their application in software engineering tools for quality assurance. Example analysis techniques include abstract interpretation of dynamically-typed programs in general, and of JavaScript programs in particular. Example tools include tools for detecting user-specified bug patterns in an implementation, or for validating an implementation with respect to a user-specified design. Here, an executable logic often serves as the tools' specification language.

Prof. Dr. Wolfgang De Meuter is a professor at the Software Languages Lab of the Vrije Universiteit Brussel. His main research interests include the design and implementation of programming languages focusing on several different concepts such as formal semantics, aspect-oriented programming and meta-programming. He has been active in the field of object-orientation since the early nineties, and after obtaining his Ph.D. in 2006 he has led the distributed programming group at the Software Languages Lab. In 2008, he was awarded the Dahl–Nygaard Prize for his contribution to object-oriented programming of ambient systems.
