ABSTRACT
We present Bugs.jar, a large-scale dataset for research in automated debugging, patching, and testing of Java programs. Bugs.jar comprises 1,158 bugs and patches drawn from 8 large, popular open-source Java projects spanning 8 diverse and prominent application categories. It is an order of magnitude larger than Defects4J, the only other dataset in its class. We discuss the methodology used to construct Bugs.jar, the representation of the dataset, and several use cases, and we illustrate three of these use cases by applying three specific tools to Bugs.jar: our own tool, Elixir, and two third-party tools, Ekstazi and JaCoCo.
- Marcel Böhme and Abhik Roychoudhury. 2014. CoREBench: Studying Complexity of Regression Errors. In Proceedings of the 2014 International Symposium on Software Testing and Analysis (ISSTA 2014). ACM, New York, NY, USA, 11.
- Cambridge University. 2013. Cambridge University Study States Software Bugs Cost Economy $312 Billion Per Year. http://www.prweb.com/releases/2013/1/prweb10298185.htm. (2013).
- M. Gligoric, L. Eloussi, and D. Marinov. 2015. Ekstazi: Lightweight Test Selection. In 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.
- C. Le Goues, N. Holtschulte, E. K. Smith, Y. Brun, P. Devanbu, S. Forrest, and W. Weimer. 2015. The ManyBugs and IntroClass Benchmarks for Automated Repair of C Programs. IEEE Transactions on Software Engineering 41, 12 (Dec 2015).
- René Just, Darioush Jalali, and Michael D Ernst. 2014. Defects4J: A database of existing faults to enable controlled testing studies for Java programs. In Proceedings of the 2014 International Symposium on Software Testing and Analysis. ACM.
- Qingzhou Luo, Farah Hariri, Lamyaa Eloussi, and Darko Marinov. 2014. An empirical analysis of flaky tests. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM.
- Matias Martinez, Thomas Durieux, Romain Sommerard, Jifeng Xuan, and Martin Monperrus. 2016. Automatic repair of real bugs in Java: A large-scale experiment on the Defects4J dataset. Empirical Software Engineering (2016), 1–29.
- Renaud Pawlak, Martin Monperrus, Nicolas Petitprez, Carlos Noguera, and Lionel Seinturier. 2015. Spoon: A Library for Implementing Analyses and Transformations of Java Source Code. Software: Practice and Experience (2015).
- Ripon K Saha, Yingjun Lyu, Hiroaki Yoshida, and Mukul R Prasad. 2017. ELIXIR: effective object oriented program repair. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering. IEEE Press, 648–659.
- TIOBE Index. 2018. http://www.tiobe.com/tiobe-index/. (January 2018).
- Bugs.jar: a large-scale, diverse dataset of real-world Java bugs