ABSTRACT
Automated program repair (APR) has great potential to reduce bug-fixing effort, and many approaches have been proposed in recent years. APR is often treated as a search problem in which the search space consists of all possible patches and the goal is to identify the correct patch in that space. Many techniques take a data-driven approach and analyze data sources such as existing patches and similar source code to help identify the correct patch. However, while existing patches and similar code provide complementary information, existing techniques analyze only a single source and cannot be easily extended to analyze both.
In this paper, we propose a novel automated program repair approach that utilizes both existing patches and similar code. Our approach mines an abstract search space from existing patches and obtains a concrete search space by differencing with similar code snippets. We then search within the intersection of the two search spaces. We have implemented our approach as a tool called SimFix and evaluated it on the Defects4J benchmark. Our tool successfully fixed 34 bugs. To the best of our knowledge, this is the largest number of bugs fixed by a single technique on the Defects4J benchmark. Furthermore, as far as we know, 13 of the bugs fixed by our approach have never been fixed by existing approaches.
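The intersection of the two search spaces can be illustrated with a small sketch. This is not the SimFix implementation: the `Modification` record, the `(kind, node_type)` abstraction, the `min_count` frequency threshold, and the sample data are all assumptions made for illustration. The sketch shows only the general idea of keeping those concrete modifications (obtained by differencing against similar code) whose abstract form also occurs in existing patches.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Modification:
    """Hypothetical record of one code change (not SimFix's own data model)."""
    kind: str       # "insert", "replace", or "delete"
    node_type: str  # syntactic category of the changed code, e.g. "MethodName"
    detail: str     # concrete text of the change


def mine_abstract_space(historical_patches, min_count=2):
    """Collect abstract modifications (kind, node_type) seen in existing patches.

    The frequency threshold `min_count` is an assumption of this sketch.
    """
    counts = Counter(
        (m.kind, m.node_type) for patch in historical_patches for m in patch
    )
    return {pattern for pattern, n in counts.items() if n >= min_count}


def restrict_to_abstract_space(concrete_mods, abstract_space):
    """Keep concrete modifications whose abstract form lies in the abstract space."""
    return [m for m in concrete_mods if (m.kind, m.node_type) in abstract_space]


# Made-up existing patches, each a list of modifications.
history = [
    [Modification("replace", "MethodName", "max -> min")],
    [
        Modification("replace", "MethodName", "add -> put"),
        Modification("insert", "IfStatement", "if (x == null) return;"),
    ],
    [Modification("insert", "IfStatement", "if (i < 0) i = 0;")],
    [Modification("delete", "ForStatement", "for (...) {...}")],
]

# Made-up result of differencing a faulty snippet against a similar snippet.
concrete_space = [
    Modification("replace", "MethodName", "floor -> ceil"),
    Modification("delete", "ForStatement", "for (...) {...}"),
    Modification("insert", "IfStatement", "if (len == 0) return 0;"),
]

abstract_space = mine_abstract_space(history)
candidates = restrict_to_abstract_space(concrete_space, abstract_space)
```

In this toy data, the `delete`/`ForStatement` modification is discarded because its abstract form appears only once in the made-up history, so only the remaining two modifications would be tried as candidate patches.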
ISSTA’18, July 16–21, 2018, Amsterdam, Netherlands. Jiajun Jiang, Yingfei Xiong, Hongyu Zhang, Qing Gao, and Xiangqun Chen
Index Terms
- Shaping program repair space with existing patches and similar code