Code smells are indicators of deeper design problems that may cause difficulties in the evolution of a software system. This paper investigates the capability of twelve code smells to reflect actual maintenance problems. Four medium-sized systems with equivalent functionality but dissimilar design were examined for code smells. Three change requests were implemented on the systems by six software developers, each of them working for up to four weeks. During that period, we recorded problems faced by developers and the associated Java files on a daily basis. We developed a binary logistic regression model, with “problematic file” as the dependent variable. Twelve code smells, file size, and churn constituted the independent variables. We found that violation of the Interface Segregation Principle (a.k.a. ISP violation) displayed the strongest connection with maintenance problems. Analysis of the nature of the problems, as reported by the developers in daily interviews and think-aloud sessions, strengthened our view about the relevance of this code smell. We observed, for example, that severe instances of problems relating to change propagation were associated with ISP violation. Based on our results, we recommend that code with ISP violation should be considered potentially problematic and be prioritized for refactoring.
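To make the ISP violation smell concrete, the following is a minimal illustrative sketch in Java. All names in it (`RecordService`, `RecordReader`, `RecordWriter`, `InMemoryRecords`, `render`) are hypothetical; they are not taken from the four systems studied, and the sketch does not reproduce the detection rule used in the study. It only shows the general idea: a "fat" interface forces every client to depend on methods it does not use, whereas narrow role interfaces limit how far a change can propagate.

```java
import java.util.HashMap;
import java.util.Map;

public class IspSketch {

    /**
     * "Fat" interface (hypothetical): a client that only reads records still
     * depends on save() and exportReport(), so changes to those methods can
     * propagate to it.
     */
    public interface RecordService {
        String load(int id);
        void save(int id, String value);
        String exportReport();
    }

    /** Segregated alternative: one narrow role interface per kind of client. */
    public interface RecordReader {
        String load(int id);
    }

    public interface RecordWriter {
        void save(int id, String value);
    }

    /** A single implementation can still satisfy several narrow interfaces. */
    public static class InMemoryRecords implements RecordReader, RecordWriter {
        private final Map<Integer, String> values = new HashMap<>();

        @Override
        public String load(int id) {
            // Unknown ids yield an empty string in this sketch.
            return values.getOrDefault(id, "");
        }

        @Override
        public void save(int id, String value) {
            values.put(id, value);
        }
    }

    /** This client depends only on the reading role, not on writing. */
    public static String render(RecordReader reader, int id) {
        return "Record " + id + ": " + reader.load(id);
    }

    public static void main(String[] args) {
        InMemoryRecords records = new InMemoryRecords();
        records.save(1, "invoice");
        System.out.println(render(records, 1));
    }
}
```

In this sketch, changing the signature of `save` would not require touching `render`, because `render` is declared against `RecordReader` only. Had it been declared against `RecordService`, it would depend on all three methods.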
Note: one maintenance problem could be related to several problematic Java files.
Positive and negative loadings can be associated with the same factor. For example, in surveys, negative loadings are caused by questions that are negatively oriented to a factor. A combination of positive and negative questions is normally used to minimize an automatic response bias by the respondents (Dunteman 1989).
In J2EE environments, it is common to use Bean files as data transfer objects. Their counterparts, the Action files (which in turn contain the business logic) access the Bean files.
These interfaces are part of the Persistence Framework. As explained previously, Persistence Framework is used as part of Java technology for managing relational data (more specifically data entities). For more information on Java persistence, see www.oracle.com.
In the social sciences, triangulation is often used to indicate that more than two methods are used in a study with a view to double (or triple) checking results. This is also called “cross examination.”
Abbes M, Khomh F, Guéhéneuc YG, Antoniol G (2011) An empirical study of the impact of two antipatterns, blob and spaghetti code, on program comprehension. In: European conf. softw. maint. and reeng., pp 181–190
Alikacem EH, Sahraoui HA (2009) A metric extraction framework based on a high-level description language. In: Int’l conf. on source code analysis and manipulation (SCAM), pp 159–167
Anda BCD, Sjøberg DIK, Mockus A (2009) Variability and reproducibility in software engineering: a study of four companies that developed the same system. IEEE Trans Softw Eng 35(3):407–429
Baxter ID, Yahin A, Moura L, Sant’Anna M, Bier L (1998) Clone detection using abstract syntax trees. In: Int’l conf. softw. maint., pp 368–377
Bergersen GR, Gustafsson JE (2011) Programming skill, knowledge, and working memory among professional software developers from an investment theory perspective. J Individ Differences 32(4):201–209
Borland (2012) Borland Together. http://www.borland.com/us/products/together. Accessed 10 May 2012
Brown W, Malveau R, McCormick S, Mowbray T (1998) AntiPatterns: refactoring software, architectures, and projects in crisis. John Wiley & Sons, Inc
Clarke K (2005) The phantom menace: omitted variable bias in econometric research. Confl Manage Peace Sci 22(4):341–352
Coad P, Yourdon E (1991) Object-oriented design. Prentice Hall, London
D’Ambros M, Bacchelli A, Lanza M (2010) On the impact of design flaws on software defects. In: Int’l conf. quality softw., pp 23–31
Deligiannis I, Shepperd M, Roumeliotis M, Stamelos I (2003) An empirical investigation of an object-oriented design heuristic for maintainability. J Syst Softw 65(2):127–139
Deligiannis I, Stamelos I, Angelis L, Roumeliotis M, Shepperd M (2004) A controlled experiment investigation of an object-oriented design heuristic for maintainability. J Syst Softw 72(2):129–143
Dunteman GE (1989) Principal components analysis. Sage university paper series on quantitative applications in the social sciences. Sage, Newbury Park, CA
Edberg D, Olfman L (2001) Organizational learning through the process of enhancing information systems. In: Int’l conf. on system sciences, pp 1–10
Edgewall-Software (2012) Trac. http://trac.edgewall.org. Accessed 10 May 2012
Field A (2009) Discovering statistics using SPSS, 3rd edn. SAGE Publications
Fischer G, Lusiardi J, Wolff von Gudenberg J (2007) Abstract syntax trees—and their role in model driven software development. In: Int’l conf. on advances in softw. eng., pp 38–38
Fokaefs M, Tsantalis N, Chatzigeorgiou A (2007) JDeodorant: identification and removal of feature envy bad smells. In: Int’l conf. softw. maint., pp 519–520
Fowler M (1999) Refactoring: improving the design of existing code. Addison-Wesley, Reading, MA
Gamma E, Helm R, Johnson R, Vlissides J (1994) Design patterns: elements of reusable object-oriented software. Addison-Wesley, Reading, MA
Intooitus (2012) InCode. http://www.intooitus.com/inCode.html. Accessed 10 May 2012
Juergens E, Deissenboeck F, Hummel B, Wagner S (2009) Do code clones matter? In: Int’l conf. softw. eng., pp 485–495
Kaiser H (1974) An index of factorial simplicity. Psychometrika 39(1):31–36
Khomh F, Di Penta M, Guéhéneuc YG (2009) An exploratory study of the impact of code smells on software change-proneness. In: Working conf. reverse eng., pp 75–84
Kiefer C, Bernstein A, Tappolet J (2007) Mining software repositories with iSPARQL and a software evolution ontology. In: Int’l workshop on mining softw. repositories, pp 1–10
Kim M, Sazawal V, Notkin D, Murphy GC (2005) An empirical study of code clone genealogies. In: European softw. eng. conf. and ACM SIGSOFT symposium on foundations of softw. eng., pp 187–196
Landauer TK, Foltz PW, Laham D (1998) An introduction to latent semantic analysis. Discourse Process 25(2–3):259–284
Lanza M, Marinescu R (2005) Object-oriented metrics in practice. Springer
Larman C (2004) Applying UML and patterns: an introduction to object-oriented analysis and design and iterative development, 3rd edn. Prentice Hall
Layman LM, Williams LA, St Amant R (2008) MimEc. In: Int’l workshop on cooperative and human aspects of softw. eng., CHASE ’08, pp 73–76
Li W, Shatnawi R (2007) An empirical study of the bad smells and class error probability in the post-release object-oriented system evolution. J Syst Softw 80(7):1120–1128
Lozano A, Wermelinger M (2008) Assessing the effect of clones on changeability. In: Int’l conf. softw. maint., pp 227–236
Mamas E, Kontogiannis K (2000) Towards portable source code representations using XML. In: Working conf. reverse eng., pp 172–182
Mäntylä MV (2005) An experiment on subjective evolvability evaluation of object-oriented software: explaining factors and interrater agreement. In: Int’l conf. softw. eng., pp 277–286
Mäntylä MV, Lassenius C (2006) Subjective evaluation of software evolvability using code smells: an empirical study. Empir Softw Eng 11(3):395–431
Mäntylä MV, Vanhanen J, Lassenius C (2004) Bad smells - humans as code critics. In: Int’l conf. softw. maint., pp 399–408
Marinescu R (2002) Measurement and quality in object oriented design. Doctoral thesis, “Politehnica” University of Timisoara
Marinescu R (2005) Measurement and quality in object-oriented design. In: Int’l conf. softw. maint., pp 701–704
Marinescu R, Ratiu D (2004) Quantifying the quality of object-oriented design: the factor-strategy model. In: Working conf. reverse eng., pp 192–201
Martin RC (2002) Agile software development, principles, patterns and practice. Prentice Hall
Menard S (1995) Applied logistic regression analysis. SAGE Publications
Moha N (2007) Detection and correction of design defects in object-oriented designs. In: ACM SIGPLAN conf. on object-oriented programming, systems, languages, and applications, pp 949–950
Moha N, Guéhéneuc YG, Leduc P (2006) Automatic generation of detection algorithms for design defects. In: IEEE/ACM int’l conf. on automated softw. eng., pp 297–300
Moha N, Guéhéneuc YG, Le Meur AF, Duchien L (2008) A domain analysis to specify design defects and generate detection algorithms. In: Fundamental approaches to softw. eng., pp 276–291
Moha N, Guéhéneuc YG, Duchien L, Le Meur AF (2010) DECOR: a method for the specification and detection of code and design smells. IEEE Trans Softw Eng 36(1):20–36
Monden A, Nakae D, Kamiya T, Sato S, Matsumoto K (2002) Software quality analysis by code clones in industrial legacy software. In: IEEE symposium on softw. metr., pp 87–94
Myers R (1990) Classical and modern regression with applications. PSW-Kent Publishing Company, Boston, MA
Olbrich SM, Cruzes DS, Sjøberg DIK (2010) Are all code smells harmful? A study of God classes and brain classes in the evolution of three open source systems. In: Int’l conf. softw. maint., pp 1–10
Oracle (2012) MySQL. http://www.mysql.com. Accessed 10 May 2012
Plone Foundation (2012) Plone CMS: open source content management. http://plone.org. Accessed 10 May 2012
Rahman F, Bird C, Devanbu P (2010) Clones: what is that smell? In: Working conf. on mining softw. repositories, pp 72–81
Rajlich VT, Gosavi P (2004) Incremental change in object-oriented programming. IEEE Softw 21(4):62–69
Rao AA, Reddy KN (2008) Detecting bad smells in object oriented design using design change propagation probability matrix. In: Int’l multiconf. of eng. and computer scientists, pp 1001–1007
Riel AJ (1996) Object-oriented design heuristics, 1st edn. Addison-Wesley, Boston, MA
Schumacher J, Zazworka N, Shull F, Seaman C, Shaw M (2010) Building empirical support for automated code smell detection. In: Int’l symposium on empirical softw. eng. and measurement, ESEM ’10, pp 1–8
Strauss A, Corbin J (1998) Basics of qualitative research: techniques and procedures for developing grounded theory. SAGE Publications
The Apache Software Foundation (2012a) Apache Subversion. http://subversion.apache.org. Accessed 10 May 2012
The Apache Software Foundation (2012b) Apache Tomcat. http://tomcat.apache.org. Accessed 10 May 2012
TMate-Software (2010) SVNKit - Subversioning for Java. http://svnkit.com. Accessed 10 May 2012
Tsantalis N, Chaikalis T, Chatzigeorgiou A (2008) JDeodorant: identification and removal of type-checking bad smells. In: European conf. softw. maint. and reeng., pp 329–331
Van Emden E, Moonen L (2001) Java quality assurance by detecting code smells. In: Working conf. reverse eng., pp 97–106
Vetro A, Morisio M, Torchiano M (2011) An empirical validation of FindBugs issues related to defects. In: Conf. on evaluation & assessment in softw. eng. (EASE), pp 144–153
Walter B, Pietrzak B (2005) Multi-criteria detection of bad smells in code with UTA method. In: Extreme programming and agile processes in software engineering (XP). Springer, Berlin/Heidelberg, pp 154–161
Yamashita A (2012a) Assessing the capability of code smells to support software maintainability assessments: empirical inquiry and methodological approach. Doctoral thesis, University of Oslo
Yamashita A (2012b) Measuring the outcomes of a maintenance project: technical details and protocols (Report no. 2012-11). Tech. Rep., Simula Research Laboratory, Oslo
Yin R (2002) Case study research: design and methods (applied social research methods). SAGE
ZD Soft (2012) ZD soft screen recorder. http://www.zdsoft.com. Accessed 10 May 2012
Zhang M, Hall T, Baddoo N (2011) Code bad smells: a review of current knowledge. J Softw Maint Evol: Res Pract 23(3):179–202
The author thanks Gunnar Bergersen for his support in selecting the developers of this study and Hans Christian Benestad for providing technical support in the planning stage of the study. Also, thanks to Bente Anda and Dag Sjøberg for finding the resources needed to conduct this study and for insightful discussions. Thanks to Erik Arisholm for sharing his expertise during the analysis of the data. Finally, special thanks to Magne Jørgensen for his guidance and discussions that led to the paper. This work was partly funded by Simula Research Laboratory and the Research Council of Norway through the projects AGILE, grant no. 179851/I40, and TeamIT, grant no. 193236/I40.
Additional information
Communicated by: Filippo Lanubile
Yamashita, A. Assessing the capability of code smells to explain maintenance problems: an empirical study combining quantitative and qualitative data. Empir Software Eng 19, 1111–1143 (2014). https://doi.org/10.1007/s10664-013-9250-3
DOI: https://doi.org/10.1007/s10664-013-9250-3