Improving Defect Localization by Classifying the Affected Asset Using Machine Learning

Halali, Sam; Staron, Miroslaw; Ochodek, Miroslaw; Meding, Wilhelm

doi:10.1007/978-3-030-05767-1_8

Conference paper
First Online: 11 December 2018

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 338)

Abstract

A vital part of a defect’s resolution is defect localization: the task of finding the exact location of the defect in the system. The defect report, and in particular its asset attribute, helps the person assigned to the problem limit the search space when investigating where the defect lies. However, research has shown that reporters often initially assign values to these attributes that provide incorrect information. In this paper, we propose and evaluate a way of automatically identifying the location of a defect by using machine learning to classify the source asset. By training a Support Vector Machine (SVM) classifier with features constructed from both categorical and textual attributes of the defect reports, we achieved an accuracy of 58.52% in predicting the source asset. However, when we trained an SVM to provide a list of recommendations rather than a single prediction, the recall increased to as much as 92.34%. Given these results, we conclude that software development teams can use these algorithms to predict up to ten potential locations, but that three predicted locations already give useful results, with an accuracy of over 70%.
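The difference between a single prediction and a list of recommendations can be illustrated with a small sketch. It assumes that each defect report already has a score per candidate asset (for example, decision values from a classifier) and that a recommendation list counts as correct when the true asset appears among the top k entries. The abstract does not state how the reported recall was computed, so this is an illustrative reading rather than the authors' procedure; the asset names, scores, and function names are hypothetical.

```python
from typing import Dict, List, Sequence


def top_k_assets(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k asset labels with the highest scores, best first."""
    ranked = sorted(scores, key=lambda asset: scores[asset], reverse=True)
    return ranked[:k]


def top_k_hit_rate(all_scores: Sequence[Dict[str, float]],
                   true_assets: Sequence[str], k: int) -> float:
    """Fraction of reports whose true asset is among the k recommendations."""
    hits = sum(
        true in top_k_assets(scores, k)
        for scores, true in zip(all_scores, true_assets)
    )
    return hits / len(true_assets)


# Hypothetical per-report scores for three defect reports and three assets.
example_scores = [
    {"asset_a": 0.9, "asset_b": 0.3, "asset_c": -0.2},
    {"asset_a": 0.1, "asset_b": 0.4, "asset_c": 0.2},
    {"asset_a": -0.5, "asset_b": 0.6, "asset_c": 0.5},
]
# Hypothetical ground-truth asset for each report.
example_truth = ["asset_a", "asset_c", "asset_a"]

hit_rate_at_1 = top_k_hit_rate(example_scores, example_truth, k=1)
hit_rate_at_2 = top_k_hit_rate(example_scores, example_truth, k=2)
print(f"top-1: {hit_rate_at_1:.2f}, top-2: {hit_rate_at_2:.2f}")
```

With k=1 the measure reduces to ordinary single-prediction accuracy. Increasing k can only keep or raise the hit rate, which matches the trend described in the abstract.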

Notes

1. A feature is an attribute describing an entity (in our case, a defect). A single entity is described as a vector of features.
2. Since information about defects is considered sensitive data, we are not allowed to provide precise information about the size of the dataset.
3. We are not allowed to provide the exact number of assets, since it is confidential information.

Author information

Authors and Affiliations

Chalmers | University of Gothenburg, Gothenburg, Sweden
Sam Halali, Miroslaw Staron & Miroslaw Ochodek
Institute of Computing Science, Poznan University of Technology, Poznań, Poland
Miroslaw Ochodek
Ericsson AB, Gothenburg, Sweden
Wilhelm Meding

Corresponding author

Correspondence to Miroslaw Ochodek.

Editor information

Editors and Affiliations

Vienna University of Technology, Vienna, Austria
Dietmar Winkler
Vienna University of Technology, Vienna, Austria
Stefan Biffl
Software Quality Lab GmbH, Linz, Austria
Johannes Bergsmann

About this paper

Cite this paper

Halali, S., Staron, M., Ochodek, M., Meding, W. (2019). Improving Defect Localization by Classifying the Affected Asset Using Machine Learning. In: Winkler, D., Biffl, S., Bergsmann, J. (eds) Software Quality: The Complexity and Challenges of Software Engineering and Software Quality in the Cloud. SWQD 2019. Lecture Notes in Business Information Processing, vol 338. Springer, Cham. https://doi.org/10.1007/978-3-030-05767-1_8

DOI: https://doi.org/10.1007/978-3-030-05767-1_8
Published: 11 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05766-4
Online ISBN: 978-3-030-05767-1
eBook Packages: Computer Science, Computer Science (R0)
