How effective are existing Java API specifications for finding bugs during runtime verification?

Published in Automated Software Engineering.

Abstract

Runtime verification can be used to find bugs early, during software development, by monitoring test executions against formal specifications (specs). The quality of runtime verification depends on the quality of the specs. While previous research has produced many specs for the Java API, manually or through automatic mining, there has been no large-scale study of their bug-finding effectiveness. Our conference paper presented the first in-depth study of the bug-finding effectiveness of previously proposed specs. We used JavaMOP to monitor 182 manually written and 17 automatically mined specs against more than 18K manually written and 2.1M automatically generated test methods in 200 open-source projects. The average runtime overhead was under 4.3×. We inspected 652 violations of manually written specs and (randomly sampled) 200 violations of automatically mined specs. We reported 95 bugs, out of which developers already fixed or accepted 76. However, most violations, 82.81% of 652 and 97.89% of 200, were false alarms. Based on our empirical results, we conclude that (1) runtime verification technology has matured enough to incur tolerable runtime overhead during testing, and (2) the existing API specifications can find many bugs that developers are willing to fix; however, (3) the false alarm rates are worrisome and suggest that substantial effort needs to be spent on engineering better specs and properly evaluating their effectiveness. We repeated our experiments on a different set of 18 projects and inspected all resulting 742 violations. The results are similar, and our conclusions are the same.
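To make the monitored properties concrete: one manually written spec in the FSL database, Collections_SynchronizedCollection, encodes the java.util.Collections Javadoc requirement that a synchronized collection be iterated only while holding its lock. The sketch below is a hypothetical illustration (class name and comments are ours, not from the study) of the usage pattern such a spec distinguishes, and of why violations in single-threaded tests are often false alarms:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SyncCollectionDemo {
    public static void main(String[] args) {
        // Wrap a plain list in a synchronized view.
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        list.add(1);
        list.add(2);

        // Violating pattern: iterating without holding the list's lock.
        // A monitor for the Collections_SynchronizedCollection spec flags
        // this even when a single-threaded test can never actually race,
        // which is one source of the false alarms discussed above:
        //
        //   for (Integer i : list) { ... }   // flagged by the monitor

        // Compliant pattern: hold the lock around the entire iteration,
        // as the java.util.Collections Javadoc requires.
        int sum = 0;
        synchronized (list) {
            for (Integer i : list) {
                sum += i;
            }
        }
        System.out.println(sum); // prints 3
    }
}
```

In the study's terms, the flagged loop is a spec violation regardless of whether the test exposes a real concurrency bug; deciding which violations are true bugs is exactly the manual inspection step applied to the 652 and 200 sampled violations.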

Figures 1–9 appear in the full article.

Notes

  1. These specs are publicly available (Pradel 2015).

References

  • Allan, C., Avgustinov, P., Christensen, A.S., Hendren, L., Kuzins, S., Lhoták, O., de Moor, O., Sereni, D., Sittampalam, G., Tibble, J.: Adding trace matching with free variables to AspectJ. In: OOPSLA, pp. 345–364 (2005)

  • Arnold, M., Vechev, M., Yahav, E.: QVM: An efficient runtime for detecting defects in deployed systems. In: OOPSLA, pp. 143–162 (2008)

  • Beckman, N.E., Nori, A.V.: Probabilistic, modular and scalable inference of typestate specifications. In: PLDI, pp. 211–221 (2011)

  • Blackburn, S.M., Garner, R., Hoffmann, C., Khang, A.M., McKinley, K.S., Bentzur, R., Diwan, A., Feinberg, D., Frampton, D., Guyer, S.Z., Hirzel, M., Hosking, A., Jump, M., Lee, H., Moss, J.E.B., Phansalkar, A., Stefanović, D., VanDrunen, T., von Dincklage, D., Wiedermann, B.: The DaCapo benchmarks: Java benchmarking development and analysis. In: OOPSLA, pp. 169–190 (2006)

  • Bodden, E.: MOPBox: a library approach to runtime verification. In: RV Tool Demo, pp. 365–369 (2011)

  • Bodden, E., Hendren, L., Lam, P., Lhoták, O., Naeem, N.A.: Collaborative runtime verification with tracematches. In: RV, pp. 22–37 (2007a)

  • Bodden, E., Hendren, L.J., Lhoták, O.: A staged static program analysis to improve the performance of runtime monitoring. In: ECOOP, pp. 525–549 (2007b)

  • Bodden, E., Lam, P., Hendren, L.: Finding programming errors earlier by evaluating runtime monitors ahead-of-time. In: FSE, pp. 36–47 (2008)

  • Chen, D., Zhang, Y., Wang, R., Li, X., Peng, L., Wei, W.: Mining universal specification based on probabilistic model. In: SEKE, pp. 471–476 (2015)

  • Chen, F., Roşu, G.: Towards monitoring-oriented programming: a paradigm combining specification and implementation. In: RV, pp. 108–127 (2003)

  • Cochran, W.G.: Sampling Techniques. Wiley, New York (1977)

  • Dallmeier, V., Knopp, N., Mallon, C., Hack, S., Zeller, A.: Generating test cases for specification mining. In: ISSTA, pp. 85–96 (2010)

  • Dwyer, M.B., Purandare, R., Person, S.: Runtime verification in context: can optimizing error detection improve fault diagnosis? In: RV, pp. 36–50 (2010)

  • Emopers: Closing ObjectOutputStream before calling toByteArray on the underlying ByteArrayOutputStream. https://github.com/JodaOrg/joda-time/pull/339 (2015). Accessed 15 Nov 2019

  • Emopers: Checking the validity of input ListIterators. https://github.com/imglib/imglib2/pull/259 (2019). Accessed 15 Nov 2019

  • Forejt, V., Kwiatkowska, M., Parker, D., Qu, H., Ujma, M.: Incremental runtime verification of probabilistic systems. In: RV, pp. 314–319 (2012)

  • Formal Systems Laboratory: JavaMOP. http://fsl.cs.illinois.edu/index.php/JavaMOP (2014). Accessed 15 Nov 2019

  • Formal Systems Laboratory: Collections_SynchronizedCollection. http://fsl.cs.illinois.edu/annotated-java/__properties/html/java/util/Collections_SynchronizedCollection.html (2015a). Accessed 15 Nov 2019

  • Formal Systems Laboratory: JavaMOPAgent Documentation. https://github.com/runtimeverification/javamop/blob/master/docs/JavaMOPAgentUsage.md (2015b). Accessed 15 Nov 2019

  • Formal Systems Laboratory: FSL Specification Database. https://runtimeverification.com/monitor/propertydb (2016). Accessed 15 Nov 2019

  • Gabel, M., Su, Z.: Online inference and enforcement of temporal properties. In: ICSE, pp. 15–24 (2010)

  • Gabel, M., Su, Z.: Testing mined specifications. In: FSE, pp. 1–11 (2012)

  • Hussein, S., Meredith, P., Roşu, G.: Security-policy monitoring and enforcement with JavaMOP. In: PLAS, pp. 1–11 (2012)

  • Jin, D., Meredith, P.O., Griffith, D., Roşu, G.: Garbage collection for monitoring parametric properties. In: PLDI, pp. 415–424 (2011)

  • Jin, D., Meredith, P.O., Lee, C., Roşu, G.: JavaMOP: Efficient parametric runtime monitoring framework. In: ICSE Demo, pp. 1427–1430 (2012a)

  • Jin, D., Meredith, P.O., Roşu, G.: Scalable parametric runtime monitoring. Technical report, Computer Science Department, UIUC (2012b)

  • Joda, S.: Joda-Time. http://www.joda.org/joda-time/ (2016). Accessed 15 Nov 2019

  • Karaorman, M., Freeman, J.: jMonitor: Java runtime event specification and monitoring library. In: RV, pp. 181–200 (2004)

  • Krka, I., Brun, Y., Medvidovic, N.: Automatic mining of specifications from invocation traces and method invariants. In: FSE, pp. 178–189 (2014)

  • Le Goues, C., Weimer, W.: Specification mining with few false positives. In: TACAS, pp. 292–306 (2009)

  • Lee, C., Chen, F., Roşu, G.: Mining parametric specifications. In: ICSE, pp. 591–600 (2011)

  • Lee, C., Jin, D., Meredith, P.O., Roşu, G.: Towards categorizing and formalizing the JDK API. Technical report, Computer Science Department, UIUC (2012)

  • Legunsen, O., Marinov, D., Roşu, G.: Evolution-aware monitoring-oriented programming. In: ICSE NIER, pp. 615–618 (2015)

  • Legunsen, O., Hariri, F., Shi, A., Lu, Y., Zhang, L., Marinov, D.: An extensive study of static regression test selection in modern software evolution. In: FSE, pp. 583–594 (2016a)

  • Legunsen, O., Hassan, W.U., Xu, X., Roşu, G., Marinov, D.: How good are the specs? A study of the bug-finding effectiveness of existing Java API specifications. In: ASE, pp. 602–613 (2016b)

  • Legunsen, O., Hassan, W.U., Xu, X., Roşu, G., Marinov, D.: Supplementary material for this paper. http://fsl.cs.illinois.edu/spec-eval (2016c). Accessed 15 Nov 2019

  • Legunsen, O., Shi, A., Marinov, D.: STARTS: STAtic Regression Test Selection. In: ASE, pp. 949–954 (2017)

  • Legunsen, O., Zhang, Y., Hadzi-Tanovic, M., Roşu, G., Marinov, D.: Techniques for evolution-aware runtime verification. In: ICST, pp. 300–311 (2019)

  • Lemieux, C.: Mining temporal properties of data invariants. In: ICSE SRC, pp. 751–753 (2015)

  • Lemieux, C., Park, D., Beschastnikh, I.: General LTL specification mining. In: ASE, pp. 81–92 (2015)

  • Ley, M.: CompleteSearch DBLP. http://www.dblp.org/search/index.php (2015). Accessed 15 Nov 2019

  • Luo, Q., Zhang, Y., Lee, C., Jin, D., Meredith, P.O., Şerbănuţă, T.F., Roşu, G.: RV-Monitor: efficient parametric runtime verification with simultaneous properties. In: RV, pp. 285–300 (2014)

  • Mao, D., Chen, L., Zhang, L.: An extensive study on cross-project predictive mutation testing. In: ICST, pp. 160–171 (2019)

  • Meredith, P., Roşu, G.: Efficient parametric runtime verification with deterministic string rewriting. In: ASE, pp. 70–80 (2013)

  • Meredith, P., Jin, D., Chen, F., Roşu, G.: Efficient monitoring of parametric context-free patterns. In: ASE, pp. 148–157 (2008)

  • Navabpour, S., Wu, C.W.W., Bonakdarpour, B., Fischmeister, S.: Efficient techniques for near-optimal instrumentation in time-triggered runtime verification. In: RV, pp. 208–222 (2011)

  • Nguyen, A.C., Khoo, S.C.: Extracting significant specifications from mining through mutation testing. In: ICFEM, pp. 472–488 (2011)

  • Nguyen, H.A., Dyer, R., Nguyen, T.N., Rajan, H.: Mining preconditions of APIs in large-scale code corpus. In: FSE, pp. 166–177 (2014)

  • Oracle: java.lang.instrument. http://docs.oracle.com/javase/7/docs/api/java/lang/instrument/package-summary.html (2015a). Accessed 15 Nov 2019

  • Oracle: java.lang.Math. https://docs.oracle.com/javase/7/docs/api/java/lang/Math.html (2015b). Accessed 15 Nov 2019

  • Oracle: java.net.URL. https://docs.oracle.com/javase/7/docs/api/java/net/URL.html (2015c). Accessed 15 Nov 2019

  • Oracle: java.util.Collections. https://docs.oracle.com/javase/7/docs/api/java/util/Collections.html (2015d). Accessed 15 Nov 2019

  • Pacheco, C., Ernst, M.D.: Randoop: feedback-directed random testing for Java. In: OOPSLA Companion, pp. 815–816 (2007)

  • Pacheco, C., Ernst, M.D.: Randoop. https://randoop.github.io/randoop/ (2016). Accessed 15 Nov 2019

  • Pacheco, C., Lahiri, S.K., Ernst, M.D., Ball, T.: Feedback-directed random test generation. In: ICSE, pp. 75–84 (2007)

  • Pacheco, C., Lahiri, S.K., Ball, T.: Finding errors in .NET with feedback-directed random testing. In: ISSTA, pp. 87–96 (2008)

  • Pradel, M.: Dynamically inferring, refining, and checking API usage protocols. In: OOPSLA Companion, pp. 773–774 (2009)

  • Pradel, M.: Statically checking API protocol conformance with mined multi-object specifications (supplementary material). http://mp.binaervarianz.de/icse2012-statically/ (2015). Accessed 15 Nov 2019

  • Pradel, M., Gross, T.R.: Automatic generation of object usage specifications from large method traces. In: ASE, pp. 371–382 (2009)

  • Pradel, M., Gross, T.R.: Leveraging test generation and specification mining for automated bug detection without false positives. In: ICSE, pp. 288–298 (2012)

  • Pradel, M., Bichsel, P., Gross, T.R.: A framework for the evaluation of specification miners based on finite state machines. In: ICSM, pp. 1–10 (2010)

  • Pradel, M., Jaspan, C., Aldrich, J., Gross, T.R.: Statically checking API protocol conformance with mined multi-object specifications. In: ICSE, pp. 925–935 (2012)

  • Purandare, R., Dwyer, M.B., Elbaum, S.: Optimizing monitoring of finite state properties through monitor compaction. In: ISSTA, pp. 280–290 (2013)

  • Reger, G., Barringer, H., Rydeheard, D.: A pattern-based approach to parametric specification mining. In: ASE, pp. 658–663 (2013)

  • Robillard, M.P., Bodden, E., Kawrykow, D., Mezini, M., Ratchford, T.: Automated API property inference techniques. TSE 39(5), 613–637 (2013)

  • Shamshiri, S., Just, R., Rojas, J., Fraser, G., McMinn, P., Arcuri, A.: Do automatically generated unit tests find real faults? An empirical study of effectiveness and challenges. In: ASE, pp. 201–211 (2015)

  • Sun, J., Xiao, H., Liu, Y., Lin, S.W., Qin, S.: TLV: abstraction through testing, learning, and validation. In: ESEC/FSE, pp. 698–709 (2015)

  • Tan, S.H., Marinov, D., Tan, L., Leavens, G.T.: @tComment: testing Javadoc comments to detect comment-code inconsistencies. In: ICST, pp. 260–269 (2012)

  • The JaCoCo Team: JaCoCo Java Code Coverage Library. https://www.jacoco.org/jacoco (2018). Accessed 15 Nov 2019

  • Thummalapenta, S., Xie, T.: Alattin: mining alternative patterns for detecting neglected conditions. In: ASE, pp. 283–294 (2009)

  • Wasylkowski, A., Zeller, A.: Mining temporal specifications from object usage. In: ASE, pp. 295–306 (2009)

  • Weimer, W., Necula, G.: Mining temporal specifications for error detection. In: TACAS, pp. 461–476 (2005)

  • Wu, C.W.W., Kumar, D., Bonakdarpour, B., Fischmeister, S.: Reducing monitoring overhead by integrating event- and time-triggered techniques. In: RV, pp. 304–321 (2013)

  • Wu, Q., Liang, G., Wang, Q., Xie, T., Mei, H.: Iterative mining of resource-releasing specifications. In: ASE, pp. 233–242 (2011)

  • Zhang, J., Wang, Z., Zhang, L., Hao, D., Zang, L., Cheng, S., Zhang, L.: Predictive mutation testing. In: ISSTA, pp. 342–353 (2016)

  • Zhang, J., Zhang, L., Harman, M., Hao, D., Jia, Y., Zhang, L.: Predictive mutation testing. TSE, pp. 898–918 (2018)

  • Zhong, H., Zhang, L., Xie, T., Mei, H.: Inferring resource specifications from natural language API documentation. In: ASE, pp. 307–318 (2009)

Acknowledgements

Karl Hajal, Milica Hadzi-Tanovic and Igor Lima helped with inspecting violations in our validation study and submitting pull requests. We thank Alex Gyori, Farah Hariri, Cosmin Radoi, and August Shi for feedback on early drafts of this paper, Rahul Gopinath for discussions and help with Randoop, and He Xiao and Yi Zhang for help with JavaMOP. We also thank all authors of papers who replied to our emails concerning their mined specs. This research was partially supported by the NSF Grants CCF-1421503, CCF-1421575, CCF-1438982, CCF-1439957, CNS-1646305, CNS-1740916, and CCF-1763788. Wajih Ul Hassan was partially supported by the Sohaib and Sara Abassi Fellowship. We gratefully acknowledge support for research on testing from Microsoft and Qualcomm.

Author information

Corresponding author

Correspondence to Owolabi Legunsen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Legunsen, O., Al Awar, N., Xu, X. et al. How effective are existing Java API specifications for finding bugs during runtime verification? Autom Softw Eng 26, 795–837 (2019). https://doi.org/10.1007/s10515-019-00267-1

