Towards Demystifying the Impact of Dependency Structures on Bug Locations in Deep Learning Libraries

Authors:

Qingshan Li

ESEM '22: Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement

Pages 249 - 260

https://doi.org/10.1145/3544902.3546246

Published: 19 September 2022

Abstract

Background: Many safety-critical industrial applications have turned to deep learning systems as a fundamental component. Most of these systems rely on deep learning libraries, and bugs in such libraries can have irreparable consequences. Aims: Over the years, dependency structure has been shown to be a practical indicator of software quality and is widely used in bug prediction techniques. However, for bugs in deep learning libraries, it remains unclear whether dependency structures still correlate strongly with bug locations and which forms of dependency structure perform best. Method: In this paper, we present a systematic investigation of these questions and implement a dependency structure-centric bug analysis tool, Depend4BL, which captures the interaction between dependency structures and bug locations in deep learning libraries. Results: We employ Depend4BL to analyze the top 5 open-source deep learning libraries on GitHub in terms of stars and forks, with 279,788 revision commits and 8,715 bug fixes. The results demonstrate significant differences among syntactic, history, and semantic structures, and their vastly different impacts on bug locations. Their combinations have the potential to further improve bug prediction for deep learning libraries. Conclusions: In summary, our work provides a new perspective on the correlation between dependency structures and bug locations in deep learning libraries. We release a large set of benchmarks and a prototype toolkit to automatically detect various forms of dependency structures for deep learning libraries. Our study also unveils useful findings, based on quantitative and qualitative analysis, that benefit bug prediction techniques for deep learning libraries.

Index Terms

Towards Demystifying the Impact of Dependency Structures on Bug Locations in Deep Learning Libraries
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems

Published In

ESEM '22: Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement

September 2022

318 pages

ISBN:9781450394277

DOI:10.1145/3544902

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ESEM '22

Sponsor:

SIGSOFT

ESEM '22: ACM / IEEE International Symposium on Empirical Software Engineering and Measurement

September 19 - 23, 2022

Helsinki, Finland

Acceptance Rates

Overall Acceptance Rate 130 of 594 submissions, 22%
