Abstract
Extract method is the “Swiss army knife” of refactorings: developers extract methods to introduce alternative signatures, decompose long code, and improve testability, among many other reasons. Although the rationales behind method extraction are well explored, little is yet known about its characteristics. Assessing these characteristics provides a basis for better understanding this important refactoring operation and for improving refactoring tools and techniques based on the actual behavior of developers. In this paper, we assess characteristics of the extract method refactoring. We rely on a state-of-the-art technique to detect method extraction, and analyze over 70K instances of this refactoring, mined from 124 software systems. We investigate five aspects of this operation: magnitude, content, transformation, size, and degree. We find that (i) extract method is among the most popular refactorings; (ii) extracted methods are overrepresented in operations related to creation, validation, and setup; (iii) methods that are targets of extraction are 2.2x longer than the average, and they are reduced by one statement after the extraction; and (iv) single method extraction represents most, but not all, of the cases. We conclude by proposing improvements to refactoring detection, suggestion, and automation tools and techniques to support both practitioners and researchers.
Notes
Data collected with the Stack Exchange API: https://data.stackexchange.com
Question ID: 1155947
Question ID: 10289461
Question ID: 2619228
Question ID: 26674797
Question ID: 511211
Question ID: 29257032
Question ID: 4930742
Question ID: 1898645
Question ID: 1247835
Question ID: 19972611
Example from Arduino project: https://goo.gl/aD8n1N
Example from Arduino project: https://goo.gl/CaQWiB
Example from Arduino project: https://goo.gl/yPvj5M
The authors excluded rename operations from their analysis.
Suffixes were much more widespread, with the top 10 suffixes covering only 7% of methods.
We only count the out-degree in the target methods with respect to the extracted methods, that is, the dashed lines in Fig. 5.
Additional information
Communicated by: Christoph Treude
About this article
Cite this article
Hora, A., Robbes, R. Characteristics of method extractions in Java: a large scale empirical study. Empir Software Eng 25, 1798–1833 (2020). https://doi.org/10.1007/s10664-020-09809-8
DOI: https://doi.org/10.1007/s10664-020-09809-8