Effective pattern-driven concurrency bug detection for operating systems

https://doi.org/10.1016/j.jss.2012.08.063

Abstract

As multi-core hardware has become more popular, concurrent programming is being more widely adopted in software. In particular, operating systems such as Linux utilize multi-threaded techniques heavily to enhance performance. However, current analysis techniques and tools for validating concurrent programs often fail to detect concurrency bugs in operating systems (OSes) due to the complex characteristics of OSes. To detect concurrency bugs in OSes in a practical manner, we have developed the COncurrency Bug dETector (COBET) framework based on composite bug patterns augmented with semantic conditions. The effectiveness, efficiency, and applicability of COBET were demonstrated by detecting 10 new bugs in file systems, device drivers, and network modules of Linux 2.6.30.4 as confirmed by the Linux maintainers.

Highlights

► We identified characteristics of concurrency bugs in the Linux kernel by reviewing ChangeLog bug reports. ► We developed a pattern-driven COncurrency Bug dETector (COBET) that utilizes composite bug patterns with semantic conditions to detect complex bugs. ► We defined four concurrency bug patterns and detected ten new bugs in the Linux kernel.

Introduction

As multi-core hardware becomes increasingly powerful and popular, operating systems (OSes) such as Linux rely heavily on cutting-edge multi-threaded techniques to enhance performance. However, current analysis techniques and tools for concurrent programs have limitations when they are applied to operating systems due to the complex characteristics of OSes. In particular, the following three characteristics of OSes make concurrency bug detection in OSes difficult.

  • Various synchronization mechanisms utilized

    Most concurrency bug detection techniques (Choi et al., 2002, Engler and Ashcraft, 2003, Naik et al., 2009, Raza and Vogel, 2008, Savage et al., 1997, Voung et al., 2007) focus on lock usage, since a majority of user-level applications utilize simple mutexes/critical sections to enforce synchronization. However, OSes exploit various synchronization mechanisms (see Table 1) for performance enhancement.

  • Customized synchronization primitives

    OS developers sometimes implement their own synchronization primitives. Thus, concurrency bug detection tools for standard synchronization mechanisms do not recognize these customized synchronization primitives and produce imprecise results (Xiong et al., 2010).

  • High complexity of operating systems

    A dynamic analysis (i.e., testing) often fails to uncover hidden concurrency bugs due to the exponential number of possible interleaving scenarios between threads in OSes. In addition, replaying bugs is difficult, since it is hard to manipulate thread schedulers in OSes directly. A static analysis, on the other hand, has limited scalability when analyzing OS code due to its high complexity and complicated data structures. Furthermore, the monolithic structure (i.e., tightly coupled large global data structures) of OSes severely hinders modular analyses.
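To make the second characteristic above concrete, the following is a minimal sketch in C11 of a home-grown lock built directly on an atomic flag. The names `custom_lock_t`, `custom_trylock`, `custom_lock`, and `custom_unlock` are hypothetical; they do not come from the Linux kernel or from COBET. A checker that only recognizes standard lock APIs would see no lock acquisition here, even though the code enforces mutual exclusion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical home-grown lock: one atomic flag, no standard mutex API. */
typedef struct {
    atomic_flag busy;
} custom_lock_t;

#define CUSTOM_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try to take the lock once; returns true if the caller now owns it. */
bool custom_trylock(custom_lock_t *l)
{
    assert(l);
    /* test-and-set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire);
}

/* Spin until the lock is acquired. */
void custom_lock(custom_lock_t *l)
{
    while (!custom_trylock(l)) {
        /* busy-wait; a real kernel primitive would also relax the CPU */
    }
}

/* Release the lock so that another thread can acquire it. */
void custom_unlock(custom_lock_t *l)
{
    assert(l);
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}
```

A tool that searches only for calls to a fixed set of lock functions cannot tell that code between `custom_lock` and `custom_unlock` is protected, which is one way such primitives lead to imprecise results.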

For these reasons, in spite of much research on concurrency bug detection (see Section 6), such techniques have seldom been applied to OS development in practice.

To alleviate the above difficulties, we have developed the COncurrency Bug dETector (COBET) framework, which utilizes composite bug patterns augmented with semantic conditions. Note that concurrency errors are caused by unintended interference between multiple threads. A salient contribution of COBET is that it utilizes multiple sub-patterns, each of which represents a buggy pattern in one thread, and checks semantic information that determines possible interferences between multiple threads in a precise and scalable manner (see Section 3). In addition, since engineers who use COBET can define various concurrency bug patterns in a flexible manner, COBET can detect concurrency bugs that are due to customized synchronization mechanisms or not targeted by lock-based concurrency bug detection tools.
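The idea of a composite pattern can be sketched as follows. This is an illustration under stated assumptions, not COBET's implementation: the `sub_match_t` record, the `may_interfere` function, and the choice of a lock-set comparison as the semantic condition are all hypothetical. Each record stands for one sub-pattern matched in one thread's code, and the check decides whether two such matches could interfere.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* One sub-pattern match: a suspicious code site in a single thread. */
typedef struct {
    const char *function;   /* function in which the sub-pattern matched */
    const char *shared_var; /* shared variable accessed at the match site */
    unsigned locks_held;    /* bitmask of locks held at the match site */
} sub_match_t;

/*
 * Assumed semantic condition: two matches may interfere only if they
 * touch the same shared variable and hold no lock in common.
 */
bool may_interfere(const sub_match_t *a, const sub_match_t *b)
{
    assert(a && b);
    bool same_var = strcmp(a->shared_var, b->shared_var) == 0;
    bool common_lock = (a->locks_held & b->locks_held) != 0;
    return same_var && !common_lock;
}
```

In this sketch, a pair of syntactic matches is reported only when the semantic condition also holds, so pairs that are protected by a common lock are filtered out instead of being raised as false alarms.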

One drawback of COBET is that a user has to identify and define bug patterns. Identifying effective (i.e., detecting many bugs) and precise (i.e., raising few false alarms) bug patterns requires the user's domain knowledge of the target code. In addition, it takes time to define the identified bug patterns concretely in a machine-processable form. Without such effort, it is easy to define imprecise bug patterns, which increases the burden of filtering out false alarms manually and thus decreases the practical usefulness of the COBET framework.

However, once such bug patterns are well-defined, corresponding pattern detectors can be implemented to detect concurrency bugs in (1) subsequent releases of the target program, and/or (2) other modules in a similar domain. It has been frequently observed that although a given bug had been fixed previously, similar bugs often appeared in subsequent releases or in different modules of the target program (see Sections 5.1 and 5.3). Thus, the initial effort to define bug patterns can be sufficiently rewarded by detecting concurrency bugs in rapidly evolving large software systems such as Linux. Furthermore, to lessen the effort needed to define bug patterns and construct corresponding bug pattern detectors, the COBET framework provides a pattern description language (PDL) (see Section 3.2).

Currently, COBET provides four concurrency bug patterns that are identified based on a review of Linux kernel ChangeLog documents. The effectiveness of COBET was demonstrated by detecting 10 new bugs in file systems, network modules, and device drivers of Linux 2.6.30.4 (the latest Linux release at the moment of the experiments), which were confirmed by Linux maintainers.

The contributions of this research are as follows:

  • We have derived interesting observations on Linux concurrency bugs from a review of the Linux ChangeLog documents for the Linux 2.6.x releases (Section 2).

  • We have developed a pattern-based concurrency bug detection framework, which can define and match various bug patterns. To improve bug detection precision, our framework utilizes composite patterns with semantic conditions in a scalable manner (Section 3).

  • Based on previous bug reports, we have defined four concurrency bug patterns with various synchronization mechanisms, which are effective at detecting new bugs in Linux that are not targeted by lock-based analysis techniques (Sections 4 and 5).

The remainder of this paper is organized as follows. Section 2 describes the characteristics of Linux to show the advantages of a pattern-based bug detection approach for Linux. Section 3 overviews the COBET framework. Section 4 explains composite bug patterns with semantic conditions built on the COBET framework. Section 5 reports the evaluation of the COBET framework through empirical results on the Linux kernel. Section 6 discusses related work. Finally, Section 7 concludes the paper.

Section snippets

Characteristics of Linux operating system

In this section, we describe the characteristics of concurrent programming practices used in Linux.

COBET framework

The observations in Section 2 suggest that a pattern-based concurrency bug detection framework can be a practical solution for Linux. Thus, we have developed the COncurrency Bug dETector (COBET) framework for concurrent C programs based on a pattern matching approach.

Composite bug patterns with semantic conditions

After reviewing the bug reports on Linux file systems (see Section 2.2), we defined the following four bug patterns:

  1. misused test and test-and-set,

  2. unsynchronized communication at thread creation,

  3. incorrect usage of atomic operations, and

  4. waiting for an already terminated thread.
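As an illustration of the kind of defect the first pattern name suggests, the sketch below contrasts a separate test followed by a set with a single atomic test-and-set, using C11 atomics rather than kernel primitives. This is one plausible reading of the pattern name, not the paper's definition; all function names are hypothetical, and the two-thread interleaving is replayed sequentially in one thread so that the outcome is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Racy claim, step 1: test whether the resource is still free. */
bool racy_test(atomic_bool *flag)
{
    assert(flag);
    return !atomic_load(flag);
}

/* Racy claim, step 2: mark the resource as taken. */
void racy_set(atomic_bool *flag)
{
    assert(flag);
    atomic_store(flag, true);
}

/* Fixed claim: test and set in one indivisible operation. */
bool claim_atomic(atomic_bool *flag)
{
    assert(flag);
    return !atomic_exchange(flag, true);
}

/*
 * Replays the interleaving T1:test, T2:test, T1:set, T2:set and
 * returns how many "threads" believe they own the resource.
 */
int replay_racy_interleaving(void)
{
    atomic_bool flag = false;
    bool t1_owns = racy_test(&flag);
    bool t2_owns = racy_test(&flag);
    if (t1_owns) racy_set(&flag);
    if (t2_owns) racy_set(&flag);
    return (int)t1_owns + (int)t2_owns;
}

/* Same two claimants, but each uses the single atomic test-and-set. */
int replay_atomic_interleaving(void)
{
    atomic_bool flag = false;
    return (int)claim_atomic(&flag) + (int)claim_atomic(&flag);
}
```

With the separate test and set, both claimants pass the test before either sets the flag, so both proceed as owners; with the atomic version only the first claimant succeeds. Each individual operation in the racy version is atomic, which is why a purely lock-based analysis has nothing to flag.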

For each bug pattern, it takes approximately 3 h for one graduate student with knowledge of the Linux file systems and the bug pattern to define a corresponding syntactic bug pattern in PDL and implement a bug pattern

Empirical results

To investigate the effectiveness, efficiency, and applicability of the COBET framework, we performed the following three empirical evaluations on Linux 2.6.30.4, the latest version at the time of this empirical study.

  • To determine whether pattern-driven bug detectors based on the old bug reports can detect new concurrency bugs in subsequent releases, we applied the four bug pattern detectors (based on the bug reports on the file systems in Linux 2.6.0–2.6.30.3) to the file systems in Linux

Related work

Pattern based techniques (Engler et al., 2000, Hallem et al., 2002, Hovemeyer and Pugh, 2004b, Otto and Moschny, 2008) can analyze large programs quickly, since these techniques perform pattern matching on a target program without sophisticated analyses. Engler et al. (Hallem et al., 2002) used a high-level state-machine language MetaL to specify system rules (i.e., programming idioms) over linear execution paths. They applied system rules such as a ‘holding lock’ rule (i.e., the acquired locks

Conclusion

We have developed a pattern-based COncurrency Bug dETector (COBET) framework for operating systems. To target complex concurrency bugs, COBET utilizes composite bug patterns and associates semantic information with code structures in bug pattern matching. While most concurrency bug detection techniques concentrate on lock usages, COBET targets various concurrency bug patterns specified by a user, so as to detect complex bugs. The effectiveness, efficiency, and applicability of COBET were

Acknowledgement

This work was supported by Basic Science Research Program through the NRF funded by the MEST (2009-0064639) and the Excellent Research Center (ERC) of Excellence Program of Korea MEST/NRF of Korea (Grant 2012-0000473).

Shin Hong received the MS degree in computer science from KAIST in 2010, where he is currently working toward the PhD degree in computer science. His research interests include concurrent program testing, automated testing, and embedded software.

References (28)

  • R. Agarwal et al. Detecting potential deadlocks with static analysis and run-time monitoring.

  • A. Bessey et al. A few billion lines of code later: using static analysis to find bugs in the real world. Communications of the ACM (2010).

  • J.-D. Choi et al. Efficient and precise datarace detection for multithreaded object-oriented programs.

  • Coverity 5.4 Checker Reference, 2011....

  • M. Dubiner et al. Faster tree pattern matching. Journal of ACM (1994).

  • E.D. Group. The C++ Front End (2011).

  • D. Engler et al. RacerX: effective, static detection of race conditions and deadlocks.

  • D. Engler et al. Checking system rules using system-specific, programmer-written compiler extensions.

  • J. Erickson et al. Effective data-race detection for the kernel.

  • E. Farchi et al. Concurrent bug patterns and how to test them.

  • S. Hallem et al. A system and language for building system-specific, static analyses.

  • D. Hovemeyer et al. Finding bugs is easy.

  • D. Hovemeyer et al. Finding concurrency bugs in Java.

  • P. Joshi et al. CalFuzzer: an extensible active testing framework for concurrent programs. Lecture Notes in Computer Science (2009).

Moonzoo Kim is an associate professor of computer science department at KAIST, South Korea. His research interests are in automated software testing, systematic analysis of concurrent programs, and formal analysis of embedded software. He has focused on practical application of automated analysis techniques to improve the quality of industry software through close collaboration with companies such as Samsung Electronics.
