A comprehensive study of deep learning compiler bugs

Authors:
Qingchao Shen, Tianjin University, China (ORCID 0000-0002-6128-2123)
Haoyang Ma, Tianjin University, China (ORCID 0000-0002-2114-7058)
Junjie Chen, Tianjin University, China (ORCID 0000-0003-3056-9962)
Yongqiang Tian, University of Waterloo, Canada (ORCID 0000-0003-1644-2965)
Shing-Chi Cheung, Hong Kong University of Science and Technology, China (ORCID 0000-0002-3508-7172)
Xiang Chen, Nantong University, China (ORCID 0000-0002-1180-3891)

ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, August 2021, Pages 968–980. https://doi.org/10.1145/3468264.3468591

Published: 18 August 2021

ABSTRACT

Deep learning (DL) compilers are increasingly used to generate optimized code that boosts the runtime performance of DL models on specific hardware. Like their traditional counterparts, DL compilers can generate incorrect code, resulting in unexpected model behaviors that may have catastrophic consequences in mission-critical systems. However, the DL models processed by DL compilers differ fundamentally from imperative programs in that their program logic is implicit. As such, the characteristics of bugs arising from traditional compilers need to be revisited in the context of DL compilers.

In this paper, we present the first systematic study of DL compiler bugs by analyzing 603 bugs arising in three popular DL compilers (i.e., TVM from Apache, Glow from Facebook, and nGraph from Intel). We analyze these bugs according to their root causes, symptoms, and the stages where they occur during compilation. We obtain 12 findings and provide a series of guidelines for future work on DL compiler bug detection and debugging. For example, a large portion (nearly 20%) of DL compiler bugs are related to types, especially tensor types. The analysis of these bugs helps design new mutation operators (e.g., adding a type cast for a tensor to promote implicit type conversion in subsequent tensor computations) to facilitate type-related bug detection. Further, we developed TVMfuzz as a proof-of-concept application of our findings to test the TVM DL compiler. It generates new tests based on TVM's original test suite, and these tests expose 8 TVM bugs that are missed by the original test suite. The result demonstrates the usefulness of our findings.
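
The type-cast mutation operator mentioned in the abstract can be illustrated with a small sketch. The code below is only an assumption-laden illustration: the three-node expression IR (`Var`, `Cast`, `Add`), the promotion order in `RANK`, and the helper names are all hypothetical, and none of it is TVM's Relay API or the TVMfuzz implementation. It shows one plausible way to wrap a tensor variable in a cast so that later computations are forced through implicit type conversion.

```python
import random
from dataclasses import dataclass

# Hypothetical mini-IR for illustration only (not TVM Relay).
@dataclass(frozen=True)
class Var:
    name: str
    dtype: str

@dataclass(frozen=True)
class Cast:
    arg: object
    dtype: str

@dataclass(frozen=True)
class Add:
    lhs: object
    rhs: object

# Assumed promotion order: a higher rank wins in mixed-type arithmetic.
RANK = {"int8": 0, "int32": 1, "float32": 2, "float64": 3}

def infer_dtype(expr):
    """Result dtype of an expression under the assumed promotion order."""
    if isinstance(expr, (Var, Cast)):
        return expr.dtype
    lhs, rhs = infer_dtype(expr.lhs), infer_dtype(expr.rhs)
    return lhs if RANK[lhs] >= RANK[rhs] else rhs

def variables(expr):
    """All Var leaves of an expression, left to right."""
    if isinstance(expr, Var):
        return [expr]
    if isinstance(expr, Cast):
        return variables(expr.arg)
    return variables(expr.lhs) + variables(expr.rhs)

def rewrite(expr, target, new):
    """Return a copy of expr with every occurrence of target replaced by new."""
    if expr == target:
        return new
    if isinstance(expr, Cast):
        return Cast(rewrite(expr.arg, target, new), expr.dtype)
    if isinstance(expr, Add):
        return Add(rewrite(expr.lhs, target, new),
                   rewrite(expr.rhs, target, new))
    return expr

def insert_cast(expr, rng):
    """Mutation operator: wrap one randomly chosen variable in a Cast
    to a different dtype, so downstream ops see mixed operand types."""
    target = rng.choice(variables(expr))
    new_dtype = rng.choice([d for d in RANK if d != target.dtype])
    return rewrite(expr, target, Cast(target, new_dtype))

if __name__ == "__main__":
    seed = Add(Var("x", "float32"), Var("y", "float32"))
    mutant = insert_cast(seed, random.Random(0))
    print(mutant, "->", infer_dtype(mutant))
```

In a real testing setup, the mutant would be compiled and executed by the compiler under test, and its behavior compared against a reference; that step is omitted here because it depends on the compiler's actual API.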

Index Terms

A comprehensive study of deep learning compiler bugs
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
  2. Software notations and tools
    1. Compilers

Published in
ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
August 2021
1690 pages
ISBN:9781450385626
DOI:10.1145/3468264
General Chairs: Diomidis Spinellis (Athens University of Economics and Business, Greece), Georgios Gousios (Facebook, Netherlands / Delft University of Technology, Netherlands)
Program Chairs: Marsha Chechik (University of Toronto, Canada), Massimiliano Di Penta (University of Sannio, Italy)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Artifacts Available / v1.1
Author Tags
Compiler Testing
Deep Learning
Deep Learning Compiler Bug
Empirical Study
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 112 of 543 submissions, 21%