ABSTRACT
Software evolves through continuous source-code changes. Software engineers usually need to understand these changes when performing their daily development and maintenance tasks. However, despite its importance, this change-understanding practice has not been systematically studied. This lack of empirical knowledge hinders attempts to evaluate this fundamental practice and to improve the corresponding tool support.
To address this issue, we present a large-scale quantitative and qualitative study at Microsoft. The study investigates the role of understanding code changes during the software-development process and explores engineers' information needs for understanding changes and their requirements for the corresponding tool support. The study results reinforce our belief that understanding code changes is an indispensable task performed by engineers in the software-development process. A number of insufficiencies in current practice also emerge from the study results. For example, it is difficult to satisfy important information needs such as a change's completeness, its consistency, and especially the risk it imposes on other software components. In addition, for understanding a composite change, it is valuable to decompose it into sub-changes that are aligned with individual development issues; however, such decomposition currently lacks tool support.
Index Terms
- How do software engineers understand code changes?: an exploratory study in industry