Statistical Debugging Using a Hierarchical Model of Correlated Predicates

  • Conference paper
Artificial Intelligence and Computational Intelligence (AICI 2011)

Abstract

The aim of statistical debugging is to identify faulty predicates that have a strong effect on program failure. In this paper, predicates are fitted into a linear regression model to consider the vertical effect of predicates on each other and on program termination status. Prior approaches have considered predicates only in isolation. The proposed approach is a two-step procedure that combines hierarchical clustering with the Lasso regression method. Hierarchical clustering builds a tree structure of correlated predicates. The Lasso method is then applied to the clusters at specified levels of the tree, which makes the method scalable with respect to program size. Unlike other statistical methods, which provide no context for the failure, this method reports a group of predicates that can be used as the bug signature. The method has been evaluated on two well-known test suites, Space and Siemens. The experimental results demonstrate the accuracy and precision of the approach compared with similar techniques.
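
The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the correlation measure, the linkage rule, the tree levels used, or how a cluster enters the regression. The sketch assumes absolute Pearson correlation, average linkage, a single cut of the tree at a hypothetical threshold `min_corr`, cluster averages as regression features, and a plain coordinate-descent Lasso with a hypothetical penalty `lam`. The data are synthetic: predicate 0 coincides with failure, predicate 1 is a noisy copy of it, and predicates 2 and 3 are unrelated.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def cluster_predicates(columns, min_corr):
    """Step 1 (assumed form): average-linkage agglomerative clustering on
    |correlation|, stopped once the best merge falls below min_corr."""
    def similarity(c1, c2):
        total = sum(abs(pearson(columns[i], columns[j])) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))
    clusters = [[j] for j in range(len(columns))]
    while len(clusters) > 1:
        best, a, b = max((similarity(clusters[p], clusters[q]), p, q)
                         for p in range(len(clusters))
                         for q in range(p + 1, len(clusters)))
        if best < min_corr:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso(features, y, lam, sweeps=100):
    """Step 2: coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1
    on centered data. Returns one coefficient per feature."""
    n = len(y)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]          # residual starts as centered y
    cols = []
    for f in features:
        m = sum(f) / n
        cols.append([v - m for v in f])      # center each feature
    beta = [0.0] * len(cols)
    for _ in range(sweeps):
        for j, x in enumerate(cols):
            norm = sum(v * v for v in x) / n
            if norm == 0:
                continue
            rho = sum(xi * (ri + beta[j] * xi) for xi, ri in zip(x, resid)) / n
            new = soft_threshold(rho, lam) / norm
            if new != beta[j]:
                resid = [ri - (new - beta[j]) * xi for ri, xi in zip(resid, x)]
                beta[j] = new
    return beta

# Synthetic runs (assumption): 64 executions, 4 predicates, 1 = predicate true.
runs = range(64)
p0 = [i & 1 for i in runs]                          # true exactly in failing runs
p1 = [p0[i] ^ (1 if i % 8 == 0 else 0) for i in runs]  # noisy copy of p0
p2 = [(i >> 1) & 1 for i in runs]                   # unrelated to failure
p3 = [(i >> 2) & 1 for i in runs]                   # unrelated to failure
columns = [p0, p1, p2, p3]
failed = p0[:]                                      # termination status, 1 = failure

clusters = cluster_predicates(columns, min_corr=0.6)
# One regression feature per cluster: the average of its member predicates.
features = [[sum(columns[j][i] for j in c) / len(c) for i in runs] for c in clusters]
beta = lasso(features, failed, lam=0.05)
# Clusters with a nonzero coefficient are reported as candidate bug signatures.
signature = [c for c, b in zip(clusters, beta) if b != 0.0]
print(clusters, beta, signature)
```

In this toy setting the two correlated predicates are merged into one cluster, and only that cluster keeps a nonzero coefficient, so it is reported as a group rather than as isolated predicates. Running the Lasso on clusters instead of individual predicates also shrinks the regression problem, which is the scalability argument the abstract makes.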

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Parsa, S., Asadi-Aghbolaghi, M., Vahidi-Asl, M. (2011). Statistical Debugging Using a Hierarchical Model of Correlated Predicates. In: Deng, H., Miao, D., Lei, J., Wang, F.L. (eds) Artificial Intelligence and Computational Intelligence. AICI 2011. Lecture Notes in Computer Science, vol 7002. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23881-9_32

  • DOI: https://doi.org/10.1007/978-3-642-23881-9_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23880-2

  • Online ISBN: 978-3-642-23881-9

  • eBook Packages: Computer Science, Computer Science (R0)
