Effective pattern-driven concurrency bug detection for operating systems
Highlights
- We identified characteristics of concurrency bugs in the Linux kernel by reviewing ChangeLog bug reports.
- We developed a pattern-driven COncurrency Bug dETector (COBET) that utilizes composite bug patterns with semantic conditions to detect complex bugs.
- We defined four concurrency bug patterns and detected ten new bugs in the Linux kernel.
Introduction
As multi-core hardware becomes increasingly powerful and popular, operating systems (OSes) such as Linux make heavy use of multi-threading techniques to enhance performance. However, current analysis techniques and tools for concurrent programs have limitations when applied to operating systems because of the complex characteristics of OSes. In particular, the following three characteristics make concurrency bug detection on OSes difficult.
- Various synchronization mechanisms utilized
  Most concurrency bug detection techniques (Choi et al., 2002; Engler and Ashcraft, 2003; Naik et al., 2009; Raza and Vogel, 2008; Savage et al., 1997; Voung et al., 2007) focus on lock usage, since a majority of user-level applications utilize simple mutexes/critical sections to enforce synchronization. However, OSes exploit various synchronization mechanisms (see Table 1) for performance enhancement.
- Customized synchronization primitives
  OS developers sometimes implement their own synchronization primitives. Thus, concurrency bug detection tools for standard synchronization mechanisms do not recognize these customized synchronization primitives and produce imprecise results (Xiong et al., 2010).
- High complexity of operating systems
  A dynamic analysis (i.e., testing) often fails to uncover hidden concurrency bugs due to the exponential number of possible interleaving scenarios between threads in OSes. In addition, replaying bugs is difficult, since it is hard to manipulate thread schedulers in OSes directly. A static analysis, on the other hand, has limited scalability when analyzing OS code due to its high complexity and complicated data structures. Furthermore, the monolithic structure (i.e., tightly coupled large global data structures) of OSes severely hinders modular analyses.
To alleviate the above difficulties, we have developed the COncurrency Bug dETector (COBET) framework, which utilizes composite bug patterns augmented with semantic conditions. Note that concurrency errors are caused by unintended interference between multiple threads. A salient contribution of COBET is that it utilizes multiple sub-patterns, each of which represents a buggy pattern in one thread, and checks semantic information that determines possible interferences between multiple threads in a precise and scalable manner (see Section 3). In addition, since engineers who use COBET can define various concurrency bug patterns in a flexible manner, COBET can detect concurrency bugs that are due to customized synchronization mechanisms or not targeted by lock-based concurrency bug detection tools.
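The idea of a composite pattern can be sketched in C as follows. This is a minimal illustration under stated assumptions, not COBET's implementation: the `sub_match` record, the lock bitmask, and the functions `may_interfere` and `count_composite_matches` are hypothetical names, and the "same variable, no common lock" test merely stands in for whatever semantic conditions Section 3 defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one sub-pattern match found in one thread's code. */
struct sub_match {
    const char *function;   /* function in which the sub-pattern matched */
    const char *shared_var; /* shared variable touched by the matched code */
    unsigned    locks_held; /* bitmask of locks held at the match site */
};

/*
 * Assumed semantic condition: two matches can interfere only if they
 * touch the same shared variable and hold no lock in common.
 */
static bool may_interfere(const struct sub_match *a, const struct sub_match *b)
{
    assert(a != NULL && b != NULL);
    if (strcmp(a->shared_var, b->shared_var) != 0)
        return false;
    return (a->locks_held & b->locks_held) == 0;
}

/*
 * Pair every match of the first sub-pattern with every match of the
 * second one and count the pairs that satisfy the semantic condition.
 * Each counted pair would be reported as one composite-pattern match.
 */
static size_t count_composite_matches(const struct sub_match *first, size_t n_first,
                                      const struct sub_match *second, size_t n_second)
{
    size_t hits = 0;
    for (size_t i = 0; i < n_first; i++)
        for (size_t j = 0; j < n_second; j++)
            if (may_interfere(&first[i], &second[j]))
                hits++;
    return hits;
}
```

In this sketch, the purely syntactic matches are cheap to collect per function, and the pairwise semantic check is what discards pairs that cannot run unprotected against each other.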
One drawback of COBET is that a user has to identify and define bug patterns. Identifying effective (i.e., detecting many bugs) and precise (i.e., raising few false alarms) bug patterns requires the user's domain knowledge of the target code. In addition, it takes time to define bug patterns for identified bugs concretely in a machine-processable form. Without such effort, it is easy to define imprecise bug patterns, which increases the burden of filtering out false alarms manually and thus decreases the practical usefulness of the COBET framework.
However, once such bug patterns are well-defined, corresponding pattern detectors can be implemented to detect concurrency bugs in (1) subsequent releases of the target program, and/or (2) other modules in a similar domain. It has been frequently observed that although a given bug had been fixed previously, similar bugs often appeared in subsequent releases or in different modules of the target program (see Sections 5.1 and 5.3). Thus, the initial effort to define bug patterns can be sufficiently rewarded by detecting concurrency bugs in rapidly evolving large software systems such as Linux. Furthermore, to lessen the effort to define bug patterns and construct corresponding bug pattern detectors, the COBET framework provides a pattern description language (PDL) (see Section 3.2).
Currently, COBET provides four concurrency bug patterns that are identified based on a review of Linux kernel ChangeLog documents. The effectiveness of COBET was demonstrated by detecting 10 new bugs in file systems, network modules, and device drivers of Linux 2.6.30.4 (the latest Linux release at the moment of the experiments), which were confirmed by Linux maintainers.
The contributions of this research are as follows:
- We have derived interesting observations on Linux concurrency bugs from a review of the Linux ChangeLog documents on Linux 2.6.x releases (Section 2).
- We have developed a pattern-based concurrency bug detection framework, which can define and match various bug patterns. To improve bug detection precision, our framework utilizes composite patterns with semantic conditions in a scalable manner (Section 3).
- Based on previous bug reports, we have defined four concurrency bug patterns with various synchronization mechanisms, which are effective in detecting new bugs in Linux that are not targeted by lock-based analysis techniques (Sections 4 and 5).
The remainder of this paper is organized as follows. Section 2 describes the characteristics of Linux to show the advantages of a pattern-based bug detection approach for Linux. Section 3 gives an overview of the COBET framework. Section 4 explains composite bug patterns with semantic conditions built on the COBET framework. Section 5 reports the evaluation of the COBET framework through empirical results on the Linux kernel. Section 6 discusses related work. Finally, Section 7 concludes the paper.
Section snippets
Characteristics of Linux operating system
In this section, we describe the characteristics of concurrent programming practices used in Linux.
COBET framework
The observations in Section 2 suggest that a pattern-based concurrency bug detection framework can be a practical solution for Linux. Thus, we have developed the COncurrency Bug dETector (COBET) framework for concurrent C programs based on a pattern matching approach.
Composite bug patterns with semantic conditions
After reviewing the bug reports on Linux file systems (see Section 2.2), we defined the following four bug patterns:
1. misused test and test-and-set,
2. unsynchronized communication at thread creation,
3. incorrect usage of atomic operations, and
4. waiting for an already terminated thread.
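To make the first pattern name concrete, the following sketch shows the general check-then-act hazard that "test" followed by "set" suggests. It is an illustration only: the precise pattern definition belongs to Section 4, which is not reproduced here, and the function names `try_claim_racy` and `try_claim_atomic` are hypothetical. The code uses C11 `<stdatomic.h>` so that it builds in user space; kernel code would typically use helpers such as `test_bit()` and `test_and_set_bit()` instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Racy claim: the test and the set are two separate steps, so two
 * threads can both observe 0 and both believe they own the resource.
 */
static bool try_claim_racy(int *busy)
{
    assert(busy != NULL);
    if (*busy == 0) {  /* test */
        *busy = 1;     /* set: another thread may have run in between */
        return true;
    }
    return false;
}

/*
 * Atomic claim: test-and-set is a single indivisible step, so exactly
 * one caller sees the flag clear and wins.
 */
static bool try_claim_atomic(atomic_flag *busy)
{
    assert(busy != NULL);
    /* atomic_flag_test_and_set returns the previous value of the flag. */
    return !atomic_flag_test_and_set(busy);
}
```

Called from a single thread, both functions behave identically: the first call succeeds and the second fails. They differ only when two callers interleave between the test and the set, which is one reason testing alone rarely exposes this kind of bug.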
For each bug pattern, it takes approximately 3 h for one graduate student with knowledge of the Linux file systems and the bug pattern to define a corresponding syntactic bug pattern in PDL and implement a bug pattern
Empirical results
To investigate the effectiveness, efficiency, and applicability of the COBET framework, we performed the following three empirical evaluations on Linux 2.6.30.4, the latest version at the time of this empirical study.
- To determine whether pattern-driven bug detectors based on the old bug reports can detect new concurrency bugs in subsequent releases, we applied the four bug pattern detectors (based on the bug reports on the file systems in Linux 2.6.0–2.6.30.3) to the file systems in Linux
Related work
Pattern-based techniques (Engler et al., 2000; Hallem et al., 2002; Hovemeyer and Pugh, 2004b; Otto and Moschny, 2008) can analyze large programs quickly, since these techniques perform pattern matching on a target program without sophisticated analyses. Engler et al. (Hallem et al., 2002) used a high-level state-machine language MetaL to specify system rules (i.e., programming idioms) over linear execution paths. They applied system rules such as a ‘holding lock’ rule (i.e., the acquired locks
Conclusion
We have developed a pattern-based COncurrency Bug dETector (COBET) framework for operating systems. To target complex concurrency bugs, COBET utilizes composite bug patterns and associates semantic information with code structures in bug pattern matching. While most concurrency bug detection techniques concentrate on lock usage, COBET targets various concurrency bug patterns specified by a user, so as to detect complex bugs. The effectiveness, efficiency, and applicability of COBET were
Acknowledgement
This work was supported by Basic Science Research Program through the NRF funded by the MEST (2009-0064639) and the Excellent Research Center (ERC) of Excellence Program of Korea MEST/NRF of Korea (Grant 2012-0000473).
Shin Hong received the MS degree in computer science from KAIST in 2010, where he is currently working toward the PhD degree in computer science. His research interests include concurrent program testing, automated testing, and embedded software.
References (28)
- et al. Detecting potential deadlocks with static analysis and run-time monitoring.
- et al. A few billion lines of code later: using static analysis to find bugs in the real world. Communications of the ACM (2010).
- et al. Efficient and precise datarace detection for multithreaded object-oriented programs.
- Coverity 5.4 Checker Reference, 2011....
- et al. Faster tree pattern matching. Journal of ACM (1994).
- The C++ Front End (2011).
- et al. RacerX: effective, static detection of race conditions and deadlocks.
- et al. Checking system rules using system-specific, programmer-written compiler extensions.
- et al. Effective data-race detection for the kernel.
- et al. Concurrent bug patterns and how to test them.
- A system and language for building system-specific, static analyses.
- Finding bugs is easy.
- Finding concurrency bugs in Java.
- CalFuzzer: an extensible active testing framework for concurrent programs. Lecture Notes in Computer Science.
Cited by (15)
- All Use-After-Free Vulnerabilities Are Not Created Equal: An Empirical Study on Their Characteristics and Detectability. 2023, ACM International Conference Proceeding Series.
- Structural-semantics Guided Program Simplification for Understanding Neural Code Intelligence Models. 2023, ACM International Conference Proceeding Series.
- Diagnosing Kernel Concurrency Failures with AITIA. 2023, Proceedings of the 18th European Conference on Computer Systems, EuroSys 2023.
- Automated Use-After-Free Detection and Exploit Mitigation: How Far Have We Gone? 2022, IEEE Transactions on Software Engineering.
- Mining Python fix patterns via analyzing fine-grained source code changes. 2022, Empirical Software Engineering.
- Recent Progress of Concurrency Bug Detection in Operating System Kernels. 2021, Ruan Jian Xue Bao/Journal of Software.
Moonzoo Kim is an associate professor in the computer science department at KAIST, South Korea. His research interests are in automated software testing, systematic analysis of concurrent programs, and formal analysis of embedded software. He has focused on the practical application of automated analysis techniques to improve the quality of industry software through close collaboration with companies such as Samsung Electronics.