Alattin: mining alternative patterns for defect detection

Automated Software Engineering

Abstract

To improve software quality, static or dynamic defect-detection tools accept programming rules as input and detect their violations in software as defects. As these programming rules are often not well documented in practice, previous work developed various approaches that mine programming rules as frequent patterns from program source code. Then these approaches use static or dynamic defect-detection techniques to detect pattern violations in source code under analysis. However, these existing approaches often produce many false positives due to various factors. To reduce false positives produced by these mining approaches, we develop a novel approach, called Alattin, that includes new mining algorithms and a technique for detecting neglected conditions based on our mining algorithm. Our new mining algorithms mine patterns in four pattern formats: conjunctive, disjunctive, exclusive-disjunctive, and combinations of these patterns. We show the benefits and limitations of these four pattern formats with respect to false positives and false negatives among detected violations by applying those patterns to the problem of detecting neglected conditions.
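The four pattern formats can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the `satisfied` and `find_violations` helpers, and the example condition checks are all hypothetical. It assumes that a conjunctive (And) pattern requires every condition check, a disjunctive (Or) pattern at least one, an exclusive-disjunctive (Xor) pattern exactly one, and that a combination pattern is formed by nesting these.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple, Union


@dataclass(frozen=True)
class And:
    """Conjunctive pattern: every part must hold (assumed semantics)."""
    parts: Tuple["Pattern", ...]


@dataclass(frozen=True)
class Or:
    """Disjunctive pattern: at least one part must hold (assumed semantics)."""
    parts: Tuple["Pattern", ...]


@dataclass(frozen=True)
class Xor:
    """Exclusive-disjunctive pattern: exactly one part must hold (assumed)."""
    parts: Tuple["Pattern", ...]


# A leaf is the name of one condition check; nesting And/Or/Xor nodes
# yields a combination pattern.
Pattern = Union[str, And, Or, Xor]


def satisfied(pattern: Pattern, checks: FrozenSet[str]) -> bool:
    """Return True if the condition checks seen at a call site meet the pattern."""
    if isinstance(pattern, str):
        return pattern in checks
    hits = sum(satisfied(part, checks) for part in pattern.parts)
    if isinstance(pattern, And):
        return hits == len(pattern.parts)
    if isinstance(pattern, Or):
        return hits >= 1
    if isinstance(pattern, Xor):
        return hits == 1
    raise TypeError(f"unsupported pattern: {pattern!r}")


def find_violations(pattern: Pattern,
                    call_sites: Dict[str, FrozenSet[str]]) -> List[str]:
    """List the call sites whose checks do not satisfy the pattern."""
    return [site for site, checks in call_sites.items()
            if not satisfied(pattern, checks)]


# Hypothetical call sites of one API method, each with the condition
# checks that guard it. The check names are illustrative only.
call_sites = {
    "site_a": frozenset({"hasNext"}),
    "site_b": frozenset({"size>0"}),
    "site_c": frozenset(),
    "site_d": frozenset({"hasNext", "size>0"}),
}

alternatives = ("hasNext", "size>0")
print(find_violations(And(alternatives), call_sites))  # ['site_a', 'site_b', 'site_c']
print(find_violations(Or(alternatives), call_sites))   # ['site_c']
print(find_violations(Xor(alternatives), call_sites))  # ['site_c', 'site_d']
```

Under these assumptions, the conjunctive pattern reports `site_a` and `site_b` even though each uses one acceptable alternative, which is the kind of false positive that alternative patterns aim to avoid. The disjunctive pattern reports only the call site with no check at all.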

References

  • Acharya, M., Xie, T., Pei, J., Xu, J.: Mining API patterns as partial orders from source code: from usage scenarios to specifications. In: Proc. ESEC/FSE, pp. 25–34 (2007)

  • Acharya, M., Xie, T., Xu, J.: Mining interface specifications for generating checkable robustness properties. In: Proc. ISSRE, pp. 311–320 (2006)

  • Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules. In: Advances in Knowledge Discovery and Data Mining, pp. 307–328 (1996)

  • Ammons, G., Bodik, R., Larus, J.R.: Mining specifications. In: Proc. POPL, pp. 4–16 (2002)

  • Burdick, D., Calimlim, M., Gehrke, J.: MAFIA: a maximal frequent itemset algorithm for transactional databases. In: Proc. ICDE, pp. 443–452 (2001)

  • Chang, R.-Y., Podgurski, A., Yang, J.: Finding what’s not there: a new approach to revealing neglected conditions in software. In: Proc. ISSTA, pp. 163–173 (2007)

  • Bibliography on mining software engineering data. https://sites.google.com/site/asergrp/dmse/ (2010)

  • Engler, D., Chen, D.Y., Hallem, S., Chou, A., Chelf, B.: Bugs as deviant behavior: a general approach to inferring errors in systems code. In: Proc. SOSP, pp. 57–72 (2001)

  • Ernst, M., Cockrell, J., Griswold, W., Notkin, D.: Dynamically discovering likely program invariants to support program evolution. IEEE Trans. Softw. Eng. 27(2), 99–123 (2001)

  • Google code search engine. http://www.google.com/codesearch (2006)

  • Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann, San Mateo (2000)

  • The Koders source code search engine. http://www.koders.com (2005)

  • Lethbridge, T., Singer, J., Forward, A.: How software engineers use documentation: the state of the practice. In: IEEE Software, pp. 35–39 (2003)

  • Li, Z., Zhou, Y.: PR-Miner: Automatically extracting implicit programming rules and detecting violations in large software codes. In: Proc. FSE, pp. 306–315 (2005)

  • Livshits, V.B., Zimmermann, T.: DynaMine: finding common error patterns by mining software revision histories. In: Proc. ESEC/FSE, pp. 296–305 (2005)

  • Nanavati, A.A., Chitrapura, K.P., Joshi, S., Krishnapuram, R.: Mining generalised disjunctive association rules. In: Proc. CIKM, pp. 482–489 (2001)

  • Nguyen, T.T., Nguyen, H.A., Pham, N.H., Al-Kofahi, J.M., Nguyen, T.N.: Graph-based mining of multiple object usage patterns. In: Proc. ESEC/FSE, pp. 383–392 (2009)

  • Ramanathan, M.K., Grama, A., Jagannathan, S.: Path-sensitive inference of function precedence protocols. In: Proc. ICSE, pp. 240–250 (2007)

  • Shimizu, K., Miura, T.: Disjunctive sequential patterns on single data sequence and its anti-monotonicity. In: Proc. MLDM, pp. 376–383 (2005)

  • Shoham, S., Yahav, E., Fink, S., Pistoia, M.: Static specification mining using automata-based abstractions. In: Proc. ISSTA, pp. 174–184 (2007)

  • Srikant, R., Agrawal, R.: Mining sequential patterns: generalizations and performance improvements. In: Proc. EDBT, pp. 3–17 (1996)

  • Thummalapenta, S., Xie, T.: PARSEWeb: a programmer assistant for reusing open source code on the web. In: Proc. ASE, pp. 204–213 (2007)

  • Thummalapenta, S., Xie, T.: Alattin: mining alternative patterns for detecting neglected conditions. In: Proc. ASE, pp. 283–294 (2009)

  • Thummalapenta, S., Xie, T.: Mining exception-handling rules as sequence association rules. In: Proc. ICSE, pp. 496–506 (2009)

  • Wasylkowski, A., Zeller, A., Lindig, C.: Detecting object usage anomalies. In: Proc. ESEC/FSE, pp. 35–44 (2007)

  • Weimer, W., Necula, G.: Mining temporal specifications for error detection. In: Proc. TACAS, pp. 461–476 (2005)

  • Williams, C.C., Hollingsworth, J.K.: Recovering system specific rules from software repositories. In: Proc. MSR, pp. 1–5 (2005)

  • Yang, J., Evans, D., Bhardwaj, D., Bhat, T., Das, M.: Perracotta: mining temporal API rules from imperfect traces. In: Proc. ICSE, pp. 282–291 (2006)

  • Zhao, L., Zaki, M.J., Ramakrishnan, N.: BLOSOM: a framework for mining arbitrary boolean expressions. In: Proc. KDD, pp. 827–832 (2006)

Author information

Corresponding author

Correspondence to Suresh Thummalapenta.

Additional information

This work was primarily done while the first author was at North Carolina State University.

This paper is an extended version of our previous work published at ASE 2009 (Thummalapenta and Xie 2009). Our previous work introduced the concept of balanced and imbalanced patterns, which are expressed in the Or pattern format. In this work, we propose two additional pattern formats, Xor and Combo, together with new algorithms for mining patterns in the Or, Xor, and Combo formats. Furthermore, we show the benefits and limitations of the And, Or, Xor, and Combo pattern formats by applying the patterns mined in these formats to the problem of detecting neglected conditions in applications under analysis.

About this article

Cite this article

Thummalapenta, S., Xie, T. Alattin: mining alternative patterns for defect detection. Autom Softw Eng 18, 293–323 (2011). https://doi.org/10.1007/s10515-011-0086-z

  • DOI: https://doi.org/10.1007/s10515-011-0086-z
