ABSTRACT
In the software development process, how to develop better software at lower cost is a major concern. Finding more defects as early as possible helps, and defect prediction can provide effective guidance for doing so. The most popular defect prediction technique is to build prediction models based on machine learning, and selecting appropriate features is critical to the performance of such models. Static analysis, on the other hand, is usually applied to defect detection: since static defect analyzers detect defects by matching well-defined "defect patterns", their results are useful for locating defects. However, defect prediction and static defect analysis have largely been treated as two parallel areas, owing to differences in research motivation, solution, and granularity.
In this paper, we present a possible approach to improving the performance of defect prediction with the help of static analysis techniques. Specifically, we propose extracting features based on the defect patterns reported by static defect analyzers and using them to improve defect prediction models. Based on this approach, we implemented a defect prediction tool and conducted experiments to measure the effect of these features.
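The core idea of the approach can be sketched as follows: turn the warnings emitted by a static defect analyzer into per-file count vectors, one dimension per defect pattern, which can then be appended to classic code metrics when training a prediction model. This is a minimal illustrative sketch, not the paper's implementation; the pattern names and the `pattern_features` function are hypothetical.

```python
from collections import Counter

# Hypothetical defect-pattern categories that a FindBugs-style static
# analyzer might report; the names are illustrative, not from the paper.
PATTERNS = ["NULL_DEREF", "RESOURCE_LEAK", "DEAD_STORE"]

def pattern_features(warnings, files):
    """Turn a list of (file, pattern) warnings into one count vector per
    file, suitable to append to classic code metrics (size, complexity,
    churn) as extra features for a defect prediction model."""
    counts = {f: Counter() for f in files}
    for f, pattern in warnings:
        counts[f][pattern] += 1
    # One fixed-order count vector per file.
    return {f: [counts[f][p] for p in PATTERNS] for f in files}

# Toy analyzer output: two warnings on A.java, one on B.java.
warnings = [("A.java", "NULL_DEREF"), ("A.java", "NULL_DEREF"),
            ("B.java", "DEAD_STORE")]
features = pattern_features(warnings, ["A.java", "B.java"])
print(features["A.java"])  # [2, 0, 0]
print(features["B.java"])  # [0, 0, 1]
```

Keeping one dimension per pattern (rather than a single total warning count) lets the learner weight patterns differently, since some patterns correlate with real defects far more strongly than others.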
Index Terms: Enhancing Defect Prediction with Static Defect Analysis