DOI: 10.1145/3445814.3446727
research-article
Open access

Kard: lightweight data race detection with per-thread memory protection

Published: 17 April 2021

Abstract

Finding data race bugs in multi-threaded programs has proven challenging. A promising direction is to use dynamic detectors that monitor the program’s execution for data races. However, despite extensive work on dynamic data race detection, most proposed systems for commodity hardware incur prohibitive overheads due to expensive compiler instrumentation of memory accesses; hence, they are not efficient enough to be used in all development and testing settings.
KARD is a lightweight system that dynamically detects data races caused by inconsistent lock usage—when a program concurrently accesses the same memory object using different locks or only some of the concurrent accesses are synchronized using a common lock. Unlike existing detectors, KARD does not monitor memory accesses using expensive compiler instrumentation. Instead, KARD leverages commodity per-thread memory protection, Intel Memory Protection Keys (MPK). Using MPK, KARD ensures that a shared object is only accessible to a single thread in its critical section, and captures all violating accesses from other concurrent threads. KARD overcomes various limitations of MPK by introducing key-enforced race detection, employing consolidated unique page allocation, carefully managing protection keys, and automatically pruning out non-racy or redundant violations. Our evaluation shows that KARD detects all data races caused by inconsistent lock usage and has a low geometric mean execution time overhead: 7.0% on PARSEC and SPLASH-2x benchmarks and 5.3% on a set of real-world applications (NGINX, memcached, pigz, and Aget).
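The idea of making a shared object accessible to only one thread while that thread is in its critical section can be illustrated with a small software model. The sketch below is not KARD's implementation and uses no MPK hardware: the class `ToyDetector`, its methods, and the thread, lock, and object names are all hypothetical, and the schedule is replayed deterministically instead of running real threads. It also omits the pruning of non-racy or redundant violations that the abstract mentions.

```python
from typing import Dict, List, Tuple


class ToyDetector:
    """Hypothetical model: an object touched inside a critical section
    belongs to that thread until the section ends."""

    def __init__(self) -> None:
        self.lock_holder: Dict[str, str] = {}        # lock -> thread holding it
        self.owner: Dict[str, Tuple[str, str]] = {}  # object -> (thread, lock)
        self.violations: List[Tuple[str, str, str]] = []  # (object, owner, intruder)

    def acquire(self, thread: str, lock: str) -> None:
        if lock in self.lock_holder:
            # The model cannot block, so it rejects schedules a mutex would forbid.
            raise RuntimeError(f"{lock} is already held; a real thread would block")
        self.lock_holder[lock] = thread

    def release(self, thread: str, lock: str) -> None:
        if self.lock_holder.get(lock) != thread:
            raise RuntimeError(f"{thread} does not hold {lock}")
        del self.lock_holder[lock]
        # Leaving the critical section gives up every object claimed under this lock.
        self.owner = {o: tl for o, tl in self.owner.items() if tl != (thread, lock)}

    def access(self, thread: str, obj: str) -> bool:
        """Return True if the access is allowed, False if it is a violation."""
        claimed = self.owner.get(obj)
        if claimed is not None and claimed[0] != thread:
            self.violations.append((obj, claimed[0], thread))
            return False
        if claimed is None:
            held = [l for l, t in self.lock_holder.items() if t == thread]
            if held:
                # Claim the object under the most recently acquired lock.
                self.owner[obj] = (thread, held[-1])
        return True


d = ToyDetector()
d.acquire("T1", "lock_a")
d.access("T1", "counter")   # T1 claims counter inside its critical section
d.acquire("T2", "lock_b")
d.access("T2", "counter")   # different lock: flagged
d.access("T3", "counter")   # no lock at all: flagged
d.release("T1", "lock_a")
d.release("T2", "lock_b")
d.acquire("T2", "lock_a")
d.access("T2", "counter")   # same lock as T1 used, after T1 left: allowed
print(d.violations)
# [('counter', 'T1', 'T2'), ('counter', 'T1', 'T3')]
```

The two flagged accesses correspond to the two forms of inconsistent lock usage named in the abstract: concurrent accesses under different locks, and an unsynchronized access alongside a synchronized one. A hardware-backed detector would raise these as protection faults instead of checking a dictionary on every access, which is where the overhead savings described above would come from.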


Published In

ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
April 2021
1090 pages
ISBN: 9781450383172
DOI: 10.1145/3445814
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. concurrency
  2. data race
  3. lock
  4. memory protection

Qualifiers

  • Research-article

Conference

ASPLOS '21
Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 300
  • Downloads (Last 6 weeks): 33
Reflects downloads up to 20 Feb 2025

Cited By

  • (2025) Enhancing concurrency vulnerability detection through AST-based static fuzz mutation. Journal of Systems and Software. DOI: 10.1016/j.jss.2025.112352 (112352). Online publication date: Jan-2025
  • (2025) Thread-sensitive fuzzing for concurrency bug detection. Computers & Security 148 (104171). DOI: 10.1016/j.cose.2024.104171. Online publication date: Jan-2025
  • (2024) PMC-Based Data Race Kernel Concurrency Bugs Detection. 2024 3rd International Conference on Computer Applications Technology (CCAT), pp. 7-12. DOI: 10.1109/CCAT64370.2024.00010. Online publication date: 15-Nov-2024
  • (2023) Snowcat: Efficient Kernel Concurrency Testing using a Learned Coverage Predictor. Proceedings of the 29th Symposium on Operating Systems Principles, pp. 35-51. DOI: 10.1145/3600006.3613148. Online publication date: 23-Oct-2023
  • (2023) Always-On Recording Framework for Serverless Computations: Opportunities and Challenges. Proceedings of the 1st Workshop on SErverless Systems, Applications and MEthodologies, pp. 41-49. DOI: 10.1145/3592533.3592810. Online publication date: 8-May-2023
  • (2023) KIT: Testing OS-Level Virtualization for Functional Interference Bugs. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pp. 427-441. DOI: 10.1145/3575693.3575731. Online publication date: 27-Jan-2023
  • (2023) μSwitch: Fast Kernel Context Isolation with Implicit Context Switches. 2023 IEEE Symposium on Security and Privacy (SP), pp. 2956-2973. DOI: 10.1109/SP46215.2023.10179284. Online publication date: May-2023
  • (2023) Memory Protection Keys: Facts, Key Extension Perspectives, and Discussions. IEEE Security and Privacy 21:3, pp. 8-15. DOI: 10.1109/MSEC.2023.3250601. Online publication date: 1-May-2023
  • (2021) Snowboard. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, pp. 66-83. DOI: 10.1145/3477132.3483549. Online publication date: 26-Oct-2021
  • (2021) Execution reconstruction: harnessing failure reoccurrences for failure reproduction. Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pp. 1155-1170. DOI: 10.1145/3453483.3454101. Online publication date: 19-Jun-2021
