A deep multimodal model for bug localization

Zhu, Ziye; Li, Yun; Wang, Yu; Wang, Yaojing; Tong, Hanghang

doi:10.1007/s10618-021-00755-7

Published: 28 April 2021

Volume 35, pages 1369–1392 (2021)

Data Mining and Knowledge Discovery

Ziye Zhu¹,
Yun Li¹˒² (ORCID: orcid.org/0000-0002-2079-9484),
Yu Wang¹,
Yaojing Wang² &
Hanghang Tong³

Abstract

Bug localization uses collected bug reports to locate the buggy source files. The state of the art falls short in three respects: (L1) the subtle difference between natural language and programming language, (L2) the noise in bug reports, and (L3) the multi-grained nature of programming language. To overcome these limitations, we propose DeMoB, a novel deep multimodal model for bug localization. It has three key features, each tailored to one of the three limitations. Specifically, DeMoB generates multimodal coordinated representations for both bug reports and source files to address L1. It further incorporates the AttL encoder to process bug reports, addressing L2, and the MDCL encoder to process source files, addressing L3. Extensive experiments on four large-scale real-world data sets demonstrate that DeMoB significantly outperforms existing techniques.
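As a rough illustration of what "coordinated representations" means in general, the sketch below maps a bug-report vector and source-file vectors through two separate projections into one shared space and ranks files by cosine similarity. It is not the DeMoB architecture: the AttL and MDCL encoders are not reproduced here, and the linear projections, the weight matrices, the toy feature vectors and the file names are all assumptions made up for this example.

```python
import math

def project(vec, weights):
    """Linearly map a feature vector into the shared space.

    Each row of `weights` produces one output dimension. A real model would
    use a learned encoder here; a fixed matrix is an assumption for brevity.
    """
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_files(report_vec, file_vecs, w_report, w_file):
    """Rank source files by similarity to a bug report in the shared space."""
    query = project(report_vec, w_report)
    scored = [(name, cosine(query, project(vec, w_file)))
              for name, vec in file_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: a 3-dimensional report vector and 2-dimensional
# file vectors, both projected into a 2-dimensional shared space.
w_report = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0]]
w_file = [[1.0, 0.0],
          [0.0, 1.0]]
report = [1.0, 0.0, 5.0]
files = {"Parser.java": [0.9, 0.1], "Lexer.java": [0.1, 0.9]}

ranking = rank_files(report, files, w_report, w_file)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The point of the two separate projections is that each modality keeps its own encoder while similarity is only ever computed in the common space, which is the defining property of a coordinated (as opposed to joint) representation.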

Notes

A bug tracking system for both free and open-source software, proprietary projects, and products. https://www.bugzilla.org.
https://www.eclipse.org/aspectj/.
https://www.eclipse.org/jdt/.
https://www.eclipse.org/swt/.
https://www.eclipse.org/eclipse/platform-ui/.
https://pytorch.org/.

Acknowledgements

This research was supported by the Natural Science Foundation of China (No. 61772284), the State Key Lab. for Novel Software Technology (KFKT2020B21), and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJKY19_0763). Hanghang Tong is partially supported by NSF (1947135 and 2003924).

Author information

Authors and Affiliations

Jiangsu Key Lab. of Big Data Security and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing, People’s Republic of China
Ziye Zhu, Yun Li & Yu Wang
State Key Lab. for Novel Software Technology, Nanjing University, Nanjing, People’s Republic of China
Yun Li & Yaojing Wang
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
Hanghang Tong

Corresponding author

Correspondence to Yun Li.

Additional information

Responsible editor: Johannes Fürnkranz.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhu, Z., Li, Y., Wang, Y. et al. A deep multimodal model for bug localization. Data Min Knowl Disc 35, 1369–1392 (2021). https://doi.org/10.1007/s10618-021-00755-7

Received: 19 October 2019
Accepted: 19 April 2021
Published: 28 April 2021
Issue Date: July 2021
DOI: https://doi.org/10.1007/s10618-021-00755-7
