A general method for rendering static analyses for diverse concurrency models modular

https://doi.org/10.1016/j.jss.2018.10.001

Highlights

  • A discussion of the limitations of static analysis techniques for concurrent programs.

  • A general method for building scalable modular analysis of concurrent programs.

  • The application of this method to actors and to threads.

  • A theoretical and empirical evaluation of the resulting analyses.

  • A new benchmark suite for evaluating static analysis of multi-threaded applications.

Abstract

Shared-memory multi-threading and the actor model both share the notion of processes featuring communication, respectively by modifying shared state and by sending messages. Existing static analyses for concurrent programs either model every possible process interleaving and therefore suffer from the state explosion problem, or feature modularity but lack in precision or in their support for dynamic processes. In this paper we present a general method for obtaining a scalable analysis of concurrent programs featuring dynamic process creation. Our ModConc method transforms an abstract concurrent semantics modeling processes and communication into a modular static analysis treating the behavior of processes separately from their communication. We present ModConc in a generic way and demonstrate its applicability by instantiating it for multi-threaded and actor-based programs. The resulting analyses are evaluated in terms of precision, performance, scalability, and soundness. While a typical non-modular static analysis times out on half of our 56 benchmarks with a 30 min timeout, ModConc analyses successfully analyze all of them in less than 30 s, while remaining on par in terms of precision. Analyzing concurrent processes in isolation while modeling their communications is the key ingredient in supporting scalable analysis of concurrent programs featuring dynamic process communication.

Introduction

In most concurrent programming models, programs consist of entities called processes that run concurrently to each other and that interfere through communication effects. Interprocess communication effects can be accesses and modifications to shared variables in thread models, or messages exchanged between different actors in the actor model (Agha, 1986, Hewitt, Bishop, Steiger, 1973). At run time, a single process in a concurrent program may create an unbounded number of additional processes. This combination of process creation and communication effects in concurrent programs results in highly dynamic control flow and data flow.

Static analyses for concurrent programs have been proposed, and are discussed in detail in Section 8. Most of these existing analyses either explicitly take into account every possible interleaving of concurrent executions at points where interprocess communication occurs, rendering them non-scalable as they are subject to the state explosion problem (Valmari, 1996), or are modular but limited in one of the following important properties: automation, precision, or support for programs in which the set of processes evolves dynamically. Performing static analysis of concurrent programs featuring unbounded processes in an automated, scalable, and precise way is therefore still an open problem, which we tackle in this paper.

Scalability of a static analysis can be achieved by modularizing it according to the general framework proposed by Cousot and Cousot (2002). A modular analysis treats the behavior of components (in our case, processes) separately from their interferences (in our case, communication). Modular static analysis has been explored in the context of shared-memory concurrency by Miné (2014) and for synchronous sequential processes by Midtgaard et al. (2016a). However, these and similar analyses are limited to programs with a known and fixed number of processes and therefore do not support analyzing programs where processes can be created dynamically.

In a modular analysis, the analysis of a component can trigger other components for re-analysis. This violates the notion of compositionality of an analysis, in which the results of the analysis for a whole program are the composition of the results of the analysis of each component. In a modular analysis, the results of the analysis follow from a fixed point obtained in the analysis of each component, having taken other components that interfere with it into account. In the case of concurrent programs, a thread that accesses a variable that is modified by another thread has to be reconsidered for analysis, as is one actor to which another actor may send a message.

This paper proposes ModConc, an approach for the modular static analysis of concurrent programs with unbounded dynamic process creation. This sets it apart from the aforementioned modular analyses for concurrent programs (Miné, 2014, Midtgaard, Nielson, Nielson, 2016a), which require a static process topology. We design ModConc as a general technique to design scalable modular analyses of concurrent programs based on their concurrent semantics. We demonstrate the use of ModConc to render an AAM-style analysis (Horn and Might, 2010) of concurrent programs scalable. AAM is a well-studied abstract interpretation method, and is used here to analyze the behavior of a single process in a flow-insensitive and context-insensitive manner. We demonstrate our approach on two concurrency models supporting dynamic process creation: threads and actors. Note that the choice of AAM and of the sensitivities is made only to demonstrate the application of ModConc, and that ModConc is not limited to such analyses.

The core insight behind ModConc is that the dynamic behavior of a process is entirely defined by its code and its communication effects. We therefore construct an intra-process analysis that analyzes a single process in isolation to infer the processes created and communication effects generated by this process under a given set of input conditions. The intra-process analysis is based on a modified version of the program semantics, replacing concurrent operations by operations that denote the corresponding generated effects but otherwise do not modify the analysis state of other processes in any way. The information obtained from the intra-process analyses is subsequently used by an inter-process analysis to compute the set of processes that interfered with the analyzed processes and therefore require (additional) intra-process analysis. When no new interprocess communication effects can be discovered, a sound over-approximation of the behavior of all processes in the program is obtained. The result is a whole-program analysis for concurrent programs that is modular in the sense of Cousot and Cousot (2002) and that infers the set of all running processes and their communication effects in a sound and scalable manner.
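The interplay between the intra-process and inter-process analyses described above can be sketched as a worklist loop. The Python below is only an illustrative sketch under simplifying assumptions, not the paper's formalization: the names `modular_analysis`, `intra`, and `toy_intra` are hypothetical, communication effects are reduced to reads and writes of shared variables, and abstract values are plain sets joined by union.

```python
from collections import deque

def modular_analysis(main, intra):
    """Inter-process fixed-point loop (illustrative sketch only).

    `intra(pid, store)` stands for a hypothetical intra-process analysis: it
    analyzes one abstract process in isolation against the current shared
    store and returns (spawned, reads, writes), where `writes` maps each
    written variable to a set of abstract values.
    """
    store = {}        # shared variable -> set of abstract values
    readers = {}      # shared variable -> processes that read it
    seen = {main}     # abstract processes discovered so far
    worklist = deque([main])
    while worklist:
        pid = worklist.popleft()
        spawned, reads, writes = intra(pid, store)
        for var in reads:
            readers.setdefault(var, set()).add(pid)
        for child in spawned:          # newly created processes need analysis
            if child not in seen:
                seen.add(child)
                worklist.append(child)
        for var, values in writes.items():
            old = store.get(var, set())
            new = old | values         # join of abstract values
            if new != old:             # a new effect: re-analyze the readers
                store[var] = new
                for reader in readers.get(var, ()):
                    if reader not in worklist:
                        worklist.append(reader)
    return seen, store

def toy_intra(pid, store):
    """Hand-written effects for three toy processes (an assumption)."""
    if pid == "main":
        return ["reader", "writer"], set(), {"x": {"int"}}
    if pid == "reader":                # copies whatever x may hold into y
        return [], {"x"}, {"y": set(store.get("x", set()))}
    if pid == "writer":
        return [], set(), {"x": {"str"}}
    return [], set(), {}

processes, store = modular_analysis("main", toy_intra)
print(sorted(processes), {k: sorted(v) for k, v in store.items()})
```

In this toy run, `reader` is analyzed a second time after `writer` adds a new abstract value to `x`, which mirrors the re-analysis triggered by newly discovered communication effects. The loop stops once no analysis produces a new effect, which is guaranteed here because the sets of abstract values are finite and only grow.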

A ModConc analysis is capable of inferring properties of concurrent programs that form the foundation of tool support for addressing pressing problems in software engineering such as program comprehension, bug detection and program verification. These inferred properties concern the processes created and their communication effects in addition to the traditional data and control flow properties computed by analyses for sequential programs. Our evaluation demonstrates that modular analyses designed with ModConc scale linearly with both the number of abstract processes created and the number of communication effects. The analyses do not suffer from the state explosion problem and are therefore able to analyze concurrent programs from a benchmark suite that consists of the actor-based programs from the well-known Savina benchmark suite (Imam and Sarkar, 2014) and their shared-memory multi-threaded equivalent, in a matter of seconds.

To summarize, the contributions of this paper are the following.

  • An extension of Cousot and Cousot's framework for modular analysis (Cousot and Cousot, 2002) to concurrent programs with dynamic process creation.

  • The application of this extension to both thread-based and actor-based concurrency.

  • The formalization, empirical validation, and discussion of termination, soundness, and complexity of the approach on an analysis for thread-based and actor-based concurrency.

  • The construction of a benchmark suite composed of programs exposing dynamic creation of threads in a shared-memory concurrency setting, similar to the Savina benchmark suite for actor programs.

Section snippets

Context: Models of concurrency and their dynamic behavior

Most concurrency models share the concepts of processes and communication. A process is a unit of computation isolated from other processes, except for interferences (communications) that can occur between processes. In the thread model, a thread is a process and communication happens through shared variables, locks, and thread joining. In the actor programming paradigm, an actor is a process and communication happens through the exchange of messages.
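To make the two communication styles concrete, here is a small sketch in Python rather than in the paper's own languages. It is an assumption-laden illustration: the names `increment` and `counting_actor` are hypothetical, and the actor is emulated with a thread and a `queue.Queue` mailbox, which only approximates a real actor runtime.

```python
import queue
import threading

# Thread model: processes communicate by reading and writing shared state.
counter = {"value": 0}
lock = threading.Lock()

def increment():
    with lock:                         # lock guards the shared variable
        counter["value"] += 1

threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                           # joining blocks until the thread ends

# Actor model: a process owns its state and reacts to messages in its mailbox.
def counting_actor(mailbox, result):
    total = 0                          # private state, never shared directly
    while True:
        message = mailbox.get()
        if message == "stop":
            result.append(total)
            return
        total += message

mailbox, result = queue.Queue(), []
actor = threading.Thread(target=counting_actor, args=(mailbox, result))
actor.start()
for _ in range(4):
    mailbox.put(1)                     # sending a message is the only effect
mailbox.put("stop")
actor.join()

print(counter["value"], result)
```

In both halves the interference between processes is confined to a small set of operations (variable accesses, lock operations, joins, message sends), which is what makes it possible to describe it as communication effects.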

Communication effects may trigger new

The ModConc Approach

Non-modular analyses explicitly explore all possible interleavings of the transition relation of a concurrent semantics (or a sound subset thereof) to derive how a concurrent program evolves at every program point. A non-modular analysis concurrently keeps track of the state of all processes and may step one of them at any given point. Every interference between two processes (process creation and interprocess communication) is immediately effected on the global analysis state.
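A back-of-the-envelope count shows why exploring interleavings does not scale. The sketch below is not taken from the paper: it assumes that each process is a fixed straight-line sequence of steps and that every step may interfere, so the number of distinct interleavings is the multinomial coefficient of the step counts. The function name `interleavings` is hypothetical.

```python
from math import factorial

def interleavings(steps_per_process):
    """Number of ways to interleave processes with the given step counts.

    Computes (k1 + ... + kn)! / (k1! * ... * kn!), dividing step by step;
    every intermediate division is exact.
    """
    total = factorial(sum(steps_per_process))
    for k in steps_per_process:
        total //= factorial(k)
    return total

print(interleavings([3, 3]))       # 20
print(interleavings([3, 3, 3]))    # 1680
print(interleavings([10, 10]))
```

Adding a single three-step process raises the count from 20 to 1680 in this simplified setting, whereas a modular analysis considers each process separately and revisits it only when a new interference is discovered.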

Analyses

Base language: λ0

In this work we explore both thread-based (Section 5) and actor-based concurrency (Section 6) on top of a Scheme-like base language λ0. This base language is based on the λ-calculus in A-Normal Form (Flanagan et al., 1993), underlining the fact that ModConc readily supports higher-order languages. This restricted language is chosen to enable focusing on the core principles of ModConc, while still preserving the challenging aspects of combining higher-order with concurrent models that feature

Shared-memory concurrency with threads: λτ

We now add support for shared-memory concurrency to the base language λ0 defined in the previous section by adding three new concepts.

  1. Thread creation and joining. Threads can be created to compute a value in a different process, and a thread can join another thread to obtain the final value of the computation performed by that other thread. Thread joining is a blocking operation and is a form of synchronization. Note that the notion of thread joining (one thread blocking until another thread

Actor-based concurrency

In this section we add support for actors to the base language λ0 from Section 4 and then show how ModConc can be applied to this extended language. We present this at a higher level, as most of the developments are similar to the ones made for λτ in the previous section. We add two new concepts to λ0.

  1. Actor definition, creation, and evolution. Actors can be created by instantiating some defined actor behavior, resulting in the creation of a new process. Actors are associated with state variables

Empirical validation

We applied the modular analyses described in this paper to a number of benchmark programs to compute flow graphs over-approximating the behavior of concurrent programs written in λτ and λα, and to deduce the communication effects that may be generated by each process. We performed a number of experiments on our implementation of the analyses and their results.

Related work

Our ModConc method derives a modular concurrency analysis from a sequentialized concurrency analysis that collects communication effects instead of directly applying them. In this paper, the sequential analysis we use is inspired by Van Horn and Might’s work on Abstracting Abstract Machines (AAM) (Horn and Might, 2010). An AAM intra-process analysis can be parameterized in such a way that it is able to infer communication effects of processes with sufficient precision. The resulting ModConc

Conclusion and future work

This paper describes ModConc, a method to derive scalable modular static analyses for concurrent programs. Scalability is achieved by avoiding the state explosion problem that arises when an analysis models every possible process interleaving at every point that processes can possibly interfere. Instead, ModConc uses an intra-process analysis to analyze the behavior of a single process in isolation to infer which processes and communication effects this process generates. The information

Acknowledgments

Quentin Stiévenart is funded by the strategic research program titled “Foundations of Programming Models for Next-Generation Computing Platforms” funded by Vrije Universiteit Brussel. Jens Nicolay is funded by the SeCloud project sponsored by Innoviris, the Brussels Institute for Research and Innovation.

References (97)

  • L. Fredlund et al.

    McErlang: a model checker for a distributed functional programming language

  • A. Miné

    Static analysis of run-time errors in embedded real-time parallel C programs

    Logical Methods Comput. Sci.

    (2012)
  • M. Abadi et al.

    Types for safe locking: static race detection for Java

    ACM Trans. Program. Lang. Syst.

    (2006)
  • H. Abelson et al.

    Revised⁵ report on the algorithmic language Scheme

    High. Order Symb. Comput.

    (1998)
  • G. Agha et al.

    A foundation for actor computation

    J. Funct. Program.

    (1997)
  • G.A. Agha

    ACTORS - A model of concurrent computation in distributed systems

    (1986)
  • E.S. Andreasen et al.

    Systematic approaches for increasing soundness and precision of static analyzers

    Proceedings of the 6th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis

    (2017)
  • C. Artho et al.

    Applying static analysis to large-scale, multi-threaded Java programs

    Proceedings of the 13th Australian Software Engineering Conference (ASWEC 2001)

    (2001)
  • T. Arts et al.

    System description: Verification of distributed Erlang programs

  • T. Arts et al.

    Verifying generic Erlang client-server implementations

  • M.F. Atig et al.

    On bounded reachability analysis of shared memory systems

    LIPIcs-Leibniz International Proceedings in Informatics

    (2014)
  • M.F. Atig et al.

    Verification of asynchronous programs with nested locks

    LIPIcs-Leibniz International Proceedings in Informatics

    (2018)
  • S. Blom et al.

    The VerCors tool for verification of concurrent programs

    Proceedings of the International Symposium on Formal Methods

    (2014)
  • C. Boyapati et al.

    Ownership types for safe programming: preventing data races and deadlocks

  • C. Boyapati et al.

    A parameterized type system for race-free Java programs

  • C. Calcagno et al.

    Compositional shape analysis by means of bi-abduction

  • E. Castegren et al.

    Reference capabilities for concurrency control

    Proceedings of the ECOOP

    (2016)
  • M. Christakis et al.

    Systematic testing for detecting concurrency errors in Erlang programs

    Proceedings of the Sixth IEEE International Conference on Software Testing, Verification and Validation, ICST, Luxembourg, Luxembourg, March 18–22, 2013

    (2013)
  • M. Christakis et al.

    Static detection of race conditions in Erlang

  • E. Clarke et al.

    SATABS: SAT-Based Predicate Abstraction for ANSI-C

    Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems

    (2005)
  • C. Colby

    Analyzing the communication topology of concurrent programs

    Proceedings of the ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation, La Jolla, California, USA, June 21–23, 1995

    (1995)
  • J.C. Corbett et al.

    Bandera: extracting finite-state models from Java source code

  • P. Cousot et al.

    Modular static program analysis

  • S. Crafa

    Behavioural types for actor systems

    CoRR

    (2012)
  • F. Dagnat et al.

    Static analysis of communications for Erlang

    Proceedings of the 8th International Erlang/OTP User Conference

    (2002)
  • M. Dam et al.

    On the verification of open distributed systems

  • E. D’Osualdo

    Verification of message passing concurrent systems

    (2015)
  • E. D’Osualdo et al.

    Automatic verification of Erlang-style concurrency

  • E. D’Osualdo et al.

    Soter: an automatic safety verifier for Erlang

  • D.R. Engler et al.

    RacerX: effective, static detection of race conditions and deadlocks

  • C. Flanagan et al.

    Types for safe locking

  • C. Flanagan et al.

    Type-based race detection for Java

  • C. Flanagan et al.

    Thread-modular verification for shared-memory programs

  • C. Flanagan et al.

    Modular verification of multithreaded programs

    Theor. Comput. Sci.

    (2005)
  • C. Flanagan et al.

    Dynamic partial-order reduction for model checking software

  • C. Flanagan et al.

    The essence of compiling with continuations

  • P. Fonseca et al.

    Finding complex concurrency bugs in large multi-threaded applications

    Proceedings of the Sixth European Conference on Computer Systems, EuroSys

    (2011)
  • E. Giachino et al.

    Actors may synchronize, safely!

    Proceedings of the 18th International Symposium on Principles and Practice of Declarative Programming

    (2016)
  • E. Giachino et al.

    Deadlock analysis of unbounded process networks

    Proceedings of the International Conference on Concurrency Theory

    (2014)
  • P. Godefroid

    Partial-Order methods for the verification of concurrent systems - An approach to the state-Explosion problem

    Lecture Notes in Computer Science

    (1996)
  • P. Godefroid

    Model checking for programming languages using VeriSoft

  • A. Gotsman et al.

    Thread-modular shape analysis

  • K. Havelund et al.

    Model checking Java programs using Java PathFinder

    STTT

    (2000)
  • L. Henrio et al.

    Analysis of synchronisations in stateful active objects

    Proceedings of the International Conference on Integrated Formal Methods

    (2017)
  • T.A. Henzinger et al.

    Thread-modular abstraction refinement

  • C. Hewitt et al.

    A universal modular ACTOR formalism for artificial intelligence

  • G.J. Holzmann

    The SPIN Model Checker - Primer and Reference Manual

    (2004)
  • K. Honda et al.

    Multiparty asynchronous session types

    Quentin Stiévenart is a post-doctoral researcher at the Software Languages Lab of the Vrije Universiteit Brussel, where he obtained his Ph.D. in 2018. His research interests are in the domain of program analysis, more specifically static program analysis for modern concurrent programs that can exhibit complex features for program analyses, going from higher-order functions to combinations of concurrency paradigms.

    Jens Nicolay is a part-time professor at the Software Languages Lab of the VUB, where he obtained his Ph.D. in 2016. Before that, he was a software consultant for over 10 years. His main expertise is static analysis of functional and object-oriented languages. He currently coordinates the programming perspective of a research project which treats security of cloud applications in a holistic manner.

    Coen De Roover is a professor in Software Engineering at the Software Languages Lab of the Vrije Universiteit Brussel, where he obtained his Ph.D. in 2009. The central theme of his research is the design of program analysis and transformation techniques, and their application in software engineering tools for quality assurance. Example analysis techniques include abstract interpretation of dynamically-typed programs in general, and of JavaScript programs in particular. Example tools include tools for detecting user-specified bug patterns in an implementation, or for validating an implementation with respect to a user-specified design. Here, an executable logic often serves as the tool's specification language.

    Prof. Dr. Wolfgang De Meuter is a professor at the Software Languages Lab of the Vrije Universiteit Brussel. His main research interests include the design and implementation of programming languages focusing on several different concepts such as formal semantics, aspect-oriented programming and meta-programming. He has been active in the field of object-orientation since the early nineties, and after obtaining his Ph.D. in 2006 he has led the distributed programming group at the Software Languages Lab. In 2008, he was awarded the Dahl–Nygaard Prize for his contribution to object-oriented programming of ambient systems.
