Discovering thread interactions in a concurrent system

https://doi.org/10.1016/j.jss.2004.04.029

Abstract

Understanding the behavior of a system is a central reverse engineering task, and is crucial for being able to modify, maintain, and improve the system. An often difficult aspect of some system behaviors is concurrency, in particular identifying those areas that exhibit mutual exclusion and those that exhibit synchronization. In this paper we present a technique that builds on our previous work in behavior discovery to find the points in the system that demonstrate mutually exclusive and synchronized behavior. Finding these points in the behavior of the system is an important aid in reverse engineering a complete and correct model of the system.

Introduction

Legacy systems that need reverse engineering often display concurrent behavior that needs to be understood. This might be in the form of multiple threads of control on a single system or it may be among distributed components. In either case, understanding the behavior of the overall system is a necessary task in reverse engineering.

One source of information about the behavior of a system is a trace of its execution, and many systems have the ability to generate some sort of trace. Such a trace might be captured from the system itself through some logging capability, from more extensive debugging options that can be enabled within a test environment, or from a protocol capture harness within a distributed component system. These traces, or more accurately a collection of such traces, can be used to help understand the behavior of a system that needs to be reverse engineered.

In previous work, we have explored the possibilities of inferring state machine models of sequential behavior, and of inferring some aspects of concurrent behavior (Cook and Wolf, 1998a, Cook and Wolf, 1998b). In addition to creating robust methods for discovering sequential models, we found that it is possible to identify potential fork and join points based on the frequencies with which event sequences occur. The output of techniques such as these is a concurrent state model that has per-thread sequencing, selection, and iteration, and has thread creation and destruction points (forks/joins).
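To make the frequency idea concrete, the following is a minimal sketch of tabulating how often one event type directly follows another across a set of traces. It is not the algorithm of Cook and Wolf (1998a, 1998b); the event names, the `successor_frequencies` helper, and the way the result is read are assumptions made only for illustration. The intuition shown is that when two events each follow a common predecessor and also follow one another in both orders, a fork after that predecessor becomes a plausible candidate.

```python
from collections import Counter, defaultdict


def successor_frequencies(traces):
    """Return {event: {next_event: relative frequency}} over all traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        # Pair every event with the event that immediately follows it.
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1
    return {
        event: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for event, c in counts.items()
    }


# Hypothetical traces: after "A", two threads emitting "B" and "C" interleave.
traces = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
]

freq = successor_frequencies(traces)

# "B" and "C" both follow "A", and each is also seen following the other.
both_orders = "C" in freq["B"] and "B" in freq["C"]
```

In this toy data `freq["A"]` splits evenly between "B" and "C", and `both_orders` is true, which is the kind of pattern that would prompt a closer look for a fork point. Real traces are noisier, so thresholds on the frequencies would be needed in practice.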

In this paper we tackle the problem of filling in the details of the thread interaction behavior. Having behavior descriptions of separate threads is useful, but for true understanding of the system, we must be able to find the points at which the threads interact. These are of two types: mutual exclusion and synchronization.

Finding mutual exclusion and synchronization from the actual dynamic behavior is useful because observations of the dynamic behavior can yield a very different understanding of the system than an analysis of the static code does. While it is true that finding all uses of the synchronized keyword in Java will reveal the areas of enforced mutual exclusion, a system may have other areas that exhibit unintended mutual exclusion, and this may cause performance problems. Similarly, threads may exhibit unintended synchronization. Apart from programming-level behavior effects, one still may need to understand the thread interactions at a higher level than the programming language constructs, and analysis over captured events can be helpful in this.
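As an illustration of how mutual exclusion might show up in observed behavior rather than in code, the sketch below counts how often executions of two regions in different threads overlap in time. This is a simplified stand-in, not the measurement used in this paper; the interval data, region names, and the `count_overlaps` helper are all hypothetical.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def count_overlaps(region_a, region_b):
    """Count pairs of executions, one from each region, that overlap in time."""
    return sum(overlaps(a, b) for a in region_a for b in region_b)


# Hypothetical observed (start, end) times of regions in two different threads.
update_t1 = [(0, 2), (5, 7), (10, 12)]
update_t2 = [(2, 4), (7, 9), (12, 14)]
compute_t2 = [(1, 3), (6, 8)]

# The two update regions are never active at the same time in these observations.
never_together = count_overlaps(update_t1, update_t2) == 0

# The update and compute regions do run concurrently on some occasions.
concurrent_pairs = count_overlaps(update_t1, compute_t2)
```

A count of zero over many observations is only evidence of mutual exclusion, not proof, and it says nothing about whether the exclusion was intended. That distinction is exactly why such observations can complement a static search for locking constructs.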

This paper presents a technique that infers points of mutual exclusion and synchronization on a model that already contains thread behavior descriptions. Specifically, we match the event trace to the model and generate measurements from which mutual exclusion and synchronization can be inferred. We also employ techniques to reduce the amount of information output to the user, and allow interactive graphical browsing of the discovered relations directly on the graphical model. Our tool SyncMex implements these techniques.
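The idea of generating measurements from which synchronization can be inferred can be sketched as follows. This is an illustrative assumption, not the metric implemented in SyncMex: it simply measures, over a set of traces, how consistently an event in one thread precedes an event in another. The trace layout, event names, and the `precedence_ratio` helper are hypothetical.

```python
def precedence_ratio(traces, first, second):
    """Fraction of traces containing both events in which `first` precedes `second`.

    Each trace is a list of (thread_id, event_name) pairs in observed order.
    Only the first occurrence of each event in a trace is considered.
    Returns None when no trace contains both events.
    """
    ordered = total = 0
    for trace in traces:
        names = [name for _, name in trace]
        if first in names and second in names:
            total += 1
            ordered += names.index(first) < names.index(second)
    return ordered / total if total else None


# Hypothetical traces from two threads, t1 and t2.
traces = [
    [("t1", "produce"), ("t2", "init"), ("t2", "consume"), ("t1", "log")],
    [("t2", "init"), ("t1", "produce"), ("t1", "log"), ("t2", "consume")],
    [("t1", "produce"), ("t1", "log"), ("t2", "init"), ("t2", "consume")],
]

always = precedence_ratio(traces, "produce", "consume")
mixed = precedence_ratio(traces, "log", "init")
```

Here "produce" precedes "consume" in every trace, which makes the pair a candidate synchronization point, while "log" and "init" appear in either order and so look independent. A ratio near 1.0 over only a few traces is weak evidence, so the number of supporting traces should be reported alongside it.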

The next section provides definitions and background discussion to place the technique in context. Section 3 introduces the technique and its component metrics. Section 4 details an example use of the technique and discusses the success of the methods on these examples. Finally, Section 5 presents related work and Section 6 concludes with some observations and some ideas for future work.

Section snippets

Background

In this section we describe our view of events, concurrency, and dependencies among events that constrain concurrency. We also discuss several assumptions that underlie our work. Throughout, we use the term system to mean the whole system, and the term thread to mean a sequential execution control path within the system, running concurrently with other threads.

Methods

In this section we describe the techniques we use to infer the points of mutual exclusion and synchronization in a system. The overall inference process is shown in Fig. 3. This process involves several different steps, each with its own background of research and techniques. It is the combination of steps 2 and 3, and the addition of the new step 4 (the last one) and its resulting output, that is the contribution of this paper. Step 4 is the focus of presentation, although the first three

Example

To exercise our tools on a real-world example, we used the Spark98 sparse matrix kernel, which uses parallel computation to compute sparse matrix vector products (O’Hallaron, 1997). We used the lock-based shared memory version, and instrumented it to log a trace of events, including both application events (zeroing the vectors, updating shared partial products, and computing local partial products), and control events (thread begin/end, mutex lock set/unset, and thread synchronization barrier

Related work

There is a long history of theoretical work concerned with inferring grammars for languages given example sentences in the language (Angluin and Smith, 1983, Gold, 1967, Jain and Sharma, 1994, Lange et al., 1994, Pitt, 1989, Valiant, 1984). Other efforts have also used statistical methods (Carrasco and Oncina, 1994, Miclet, 1990). None of these early efforts looked at the problem of concurrency in the trace.

The area of dynamic analysis has been growing rapidly, and some representative work in

Conclusion

In this paper we developed and demonstrated a technique for discovering mutual exclusion and synchronization relations on states of a system behavior model that had no prior indication of such relations. This built on our previous work that could discover the structure of a concurrent model, but not the inter-thread relations. Discovering these relations encompassed extracting information from previous techniques in model discovery and model-to-behavior matching, and then processing that

Acknowledgments

I would like to thank the anonymous reviewers for their insightful comments that helped make this a better paper. This work was supported in part by the National Science Foundation under grant CCR-9804067, and by the Department of Education. The content of the information does not necessarily reflect the position or the policy of the Government and no official endorsement should be inferred.

Jonathan Cook is associate professor in the Computer Science Department at New Mexico State University. His research interests are in the areas of software process data analysis, dynamic analysis of software, reliable component-based systems, and large software system maintenance. Dr. Cook is a member of the ACM and the IEEE Computer Society.

References (33)

  • E. Gold. Language identification in the limit. Information and Control (1967)
  • R. Agrawal et al. Mining Process Models from Workflow Logs. Lecture Notes in Computer Science (1998)
  • Alur, R., Etessami, K., Yannakakis, M., 2000. Inference of message sequence charts. In: Proceedings of 22nd...
  • D. Angluin et al. Inductive inference: theory and methods. ACM Computing Surveys (1983)
  • G. Avrunin et al. Automated analysis of concurrent systems with the constrained expression toolset. IEEE Transactions on Software Engineering (1991)
  • P. Bates. Debugging heterogeneous systems using event-based models of behavior
  • R. Carrasco et al. Learning stochastic regular grammars by means of a state merging method
  • Cook, J., Du, Z., 2002. Discovering thread interactions in a concurrent system. In: Proceedings of 2002 Working...
  • J. Cook et al. Discovering Models of Software Processes from Event-Based Data. ACM Transactions on Software Engineering and Methodology (1998)
  • J. Cook et al. Event-based detection of concurrency
  • J. Cook et al. Software process validation: quantitatively measuring the correspondence of a process to a model. ACM Transactions on Software Engineering and Methodology (1999)
  • Cook, J., He, C., Ma, C., 2001. Measuring behavioral correspondence to a timed concurrent model. In: Proceedings of...
  • J. Cuny et al. The Ariadne debugger: scalable application of event-based abstraction
  • J. Devore. Probability and Statistics for Engineering and the Sciences (1991)
  • M. Diaz et al. Observer—a concept for formal on-line validation of distributed systems. IEEE Transactions on Software Engineering (1994)
  • Eisenhauer, G., Gu, W., Kraemer, E., Schwan, K., Stasko, J., 1997. Online displays of parallel programs: problems and...

Zhidian Du received his M.S. degree in Computer Science in 2001 from New Mexico State University.