DOI: 10.1145/3677333.3678160
research-article
Open access

Detecting Data Races in OpenMP with Deep Learning and Large Language Models

Published: 12 August 2024

Abstract

Transformer-based neural network models are increasingly employed to handle software engineering issues, such as bug localization and program repair. These models, equipped with a self-attention mechanism, excel at understanding source code context and semantics. Recently, large language models (LLMs) have emerged as a promising alternative for analyzing and understanding code structure. In this paper, we propose two novel methods for detecting data race bugs in OpenMP programs. The first method is based on a transformer encoder trained from scratch. The second method leverages LLMs, specifically extending GPT-4 Turbo through the use of prompt engineering and fine-tuning techniques. For training and testing our approach, we utilized two datasets comprising different OpenMP directives. Our experiments show that the transformer encoder achieves competitive accuracy compared to LLMs, whether through fine-tuning or prompt engineering techniques. This performance may be attributed to the complexity of many OpenMP directives and the limited availability of labeled datasets.
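
To make the target bug class concrete, the following is a minimal sketch of a data race in an OpenMP loop, together with a race-free variant. It is an illustration only and is not taken from the paper's datasets; the function names `sum_racy` and `sum_reduction` are hypothetical. Build with an OpenMP-capable compiler (for example, `-fopenmp` with GCC or Clang); without that flag the pragmas are ignored and both functions run serially.

```c
/* Racy version: all threads read and write the shared variable `sum`
   without synchronization, so concurrent updates can be lost and the
   result may vary from run to run. */
long sum_racy(int n) {
    long sum = 0;
    #pragma omp parallel for
    for (int i = 1; i <= n; i++) {
        sum += i; /* data race: unsynchronized read-modify-write */
    }
    return sum;
}

/* Race-free version: the reduction clause gives each thread a private
   copy of `sum` and combines the copies when the loop finishes. */
long sum_reduction(int n) {
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}
```

The two functions differ only in the `reduction(+:sum)` clause, which illustrates why a detector has to account for OpenMP directives and clauses rather than the loop body alone.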



      Published In

      ICPP Workshops '24: Workshop Proceedings of the 53rd International Conference on Parallel Processing
      August 2024
      131 pages
      ISBN: 9798400718021
      DOI: 10.1145/3677333
      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. CodeBERTa
      2. GPT-4 Turbo
      3. OpenMP
      4. bug detection
      5. data race
      6. large language model
      7. race condition
      8. transformer encoder

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ICPP Workshops '24

      Acceptance Rates

      Overall Acceptance Rate 91 of 313 submissions, 29%
