Linguistic antipatterns: what they are and how developers perceive them

Arnaoudova, Venera; Di Penta, Massimiliano; Antoniol, Giuliano

doi:10.1007/s10664-014-9350-8

Published: 29 January 2015

Volume 21, pages 104–158, (2016)

Empirical Software Engineering

Abstract

Antipatterns are known as poor solutions to recurring problems. For example, Brown et al. and Fowler define practices concerning poor design or implementation solutions. However, we know that the source code lexicon is one of the factors that affect the psychological complexity of a program, i.e., factors that make a program difficult for humans to understand and maintain. The aim of this work is to identify recurring poor practices related to inconsistencies among the naming, documentation, and implementation of an entity, called Linguistic Antipatterns (LAs), that may impair program understanding. To this end, we first mine examples of such inconsistencies in real open-source projects and abstract them into a catalog of 17 recurring LAs related to methods and attributes. Then, to understand the relevance of LAs, we perform two empirical studies with developers: 30 external (i.e., not familiar with the code) and 14 internal (i.e., people developing or maintaining the code). Results indicate that the majority of the participants (69% of the external and 51% of the internal developers) perceive LAs as poor practices that must therefore be avoided. As further evidence of LAs’ validity, open-source developers who were made aware of LAs reacted to the issue by making code changes in 10% of the cases. Finally, to facilitate the use of LAs in practice, we identified a subset of LAs that were universally agreed upon as being problematic: those with a clear dissonance between code behavior and lexicon.

Notes

http://cocoon.apache.org
http://www.eclipse.org
http://www.veneraarnaoudova.ca/tools
http://eclipse-cs.sourceforge.net/
http://checkstyle.sourceforge.net/
http://www.oracle.com/technetwork/java/codeconv-138413.html
For projects where we did not provide a version, we used version control (accessed on 31/05/2013).
http://ser.soccerlab.polymtl.ca/ser-repos/public/tr-data/lapd-rep-pckg.zip
http://argouml.tigris.org
http://cocoon.apache.org
http://www.eclipse.org
http://www.sensopia.com/english/index.html
None of the questionnaires containing examples of type C.2 was answered.
We do not report project names with the examples, in order to preserve the confidentiality of the provided answers.
A change may be one or more of the following: modification, addition, or removal.

References

Abbes M, Khomh F, Guéhéneuc YG, Antoniol G (2011) An empirical study of the impact of two antipatterns, Blob and Spaghetti Code, on program comprehension. In: Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR), pp 181–190
Abebe S, Tonella P (2011) Towards the extraction of domain concepts from the identifiers. In: Proceedings of the Working Conference on Reverse Engineering (WCRE), pp 77–86
Abebe S, Tonella P (2013) Automated identifier completion and replacement. In: Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR), pp 263–272
Abebe SL, Haiduc S, Tonella P, Marcus A (2011) The effect of lexicon bad smells on concept location in source code. In: Proceedings of the International Working Conference on Source Code Analysis and Manipulation (SCAM), pp 125–134
Abebe SL, Arnaoudova V, Tonella P, Antoniol G, Guéhéneuc YG (2012) Can lexicon bad smells improve fault prediction? In: Proceedings of the Working Conference on Reverse Engineering (WCRE), pp 235–244
Anquetil N, Lethbridge T (1998) Assessing the relevance of identifier names in a legacy software system. In: Proceedings of the International Conference of the Centre for Advanced Studies on Collaborative Research (CASCON), pp 213–222
Arnaoudova V, Di Penta M, Antoniol G, Guéhéneuc YG (2013) A new family of software anti-patterns: Linguistic anti-patterns. In: Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR), pp 187–196
Arnaoudova V, Eshkevari L, Di Penta M, Oliveto R, Antoniol G, Guéhéneuc YG (2014) Repent: Analyzing the nature of identifier renamings. IEEE Trans Softw Eng (TSE) 40(5):502–532
Brooks R (1983) Towards a theory of the comprehension of computer programs. Int J Man-Machine Stud 18(6):543–554
Brown WJ, Malveau RC, Brown WH, McCormick III HW, Mowbray TJ (1998a) Anti patterns: refactoring software, architectures, and projects in crisis, 1st edn. Wiley, New York
Brown WJ, Malveau RC, McCormick III HW, Mowbray TJ (1998b) AntiPatterns: refactoring software, architectures, and projects in crisis. Wiley, New York
Caprile B, Tonella P (1999) Nomen est omen: Analyzing the language of function identifiers. In: Proceedings of Working Conference on Reverse Engineering (WCRE), pp 112–122
Caprile B, Tonella P (2000) Restructuring program identifier names. In: Proceedings of the International Conference on Software Maintenance (ICSM), pp 97–107
Chaudhary BD, Sahasrabuddhe HV (1980) Meaningfulness as a factor of program complexity. In: Proceedings of the ACM Annual Conference, ACM, ACM ’80, pp 457–466
De Lucia A, Di Penta M, Oliveto R (2011) Improving source code lexicon via traceability and information retrieval. IEEE Trans Softw Eng 37(2):205–227
Deissenbock F, Pizka M (2005) Concise and consistent naming. In: Proceedings of the International Workshop on Program Comprehension (IWPC), pp 97–106
Fowler M (1999) Refactoring: improving the design of existing code. Addison-Wesley, MA
Gamma E, Helm R, Johnson R, Vlissides J (1995) Design patterns: elements of reusable object oriented software. Addison-Wesley, Boston
Glaser BG (1992) Basics of grounded theory analysis. Sociology Press
Grissom RJ, Kim JJ (2005) Effect sizes for research: a broad practical approach, 2nd edn. Lawrence Erlbaum Associates
Groves RM, Fowler Jr FJ, Couper MP, Lepkowski JM, Singer E, Tourangeau R (2009) Survey methodology, 2nd edn. Wiley, New York
Hintze JL, Nelson RD (1998) Violin plots: a box plot-density trace synergism. Am Stat 52(2):181–184
Jedlitschka A, Pfahl D (2005) Reporting guidelines for controlled experiments in software engineering. In: International symposium on empirical software engineering
Khomh F, Di Penta M, Guéhéneuc YG (2009) An exploratory study of the impact of code smells on software change-proneness. In: Proceedings of the working conference on reverse engineering (WCRE), pp 75–84
Khomh F, Di Penta M, Guéhéneuc YG, Antoniol G (2012) An exploratory study of the impact of antipatterns on class change- and fault-proneness. Empir Softw Eng 17(3):243–275
Kitchenham B, Pfleeger S, Pickard L, Jones P, Hoaglin D, El Emam K, Rosenberg J (2002) Preliminary guidelines for empirical research in software engineering. IEEE Trans Softw Eng (TSE) 28(8):721–734
Lawrie D, Morrell C, Feild H, Binkley D (2006) What’s in a name? a study of identifiers. In: Proceedings of the International Conference on Program Comprehension (ICPC), pp 3–12
Lawrie D, Morrell C, Feild H, Binkley D (2007) Effective identifier names for comprehension and memory. Innovations Syst Softw Eng 3(4):303–318
Merlo E, McAdam I, De Mori R (2003) Feed-forward and recurrent neural networks for source code informal information analysis. J Softw Maint 15(4):205–244
Miller GA (1995) WordNet: a lexical database for English. Commun ACM 38(11):39–41
Moha N, Guéhéneuc YG, Duchien L, Le Meur AF (2010) DECOR: a method for the specification and detection of code and design smells. IEEE Trans Softw Eng (TSE’10) 36(1):20–36
Nagappan M, Zimmermann T, Bird C (2013) Diversity in software engineering research. In: Proceedings of the joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 466–476
Oppenheim AN (1992) Questionnaire design, interviewing and attitude measurement. Pinter, London
Palomba F, Bavota G, Di Penta M, Oliveto R, De Lucia A, Poshyvanyk D (2013) Detecting bad smells in source code using change history information. In: Proceedings of the international conference on automated software engineering (ASE), pp 268–278
Palomba F, Bavota G, Di Penta M, Oliveto R, De Lucia A (2014) Do they really smell bad? A study on developers’ perception of code bad smells. In: International conference on software maintenance and evolution (ICSME), to appear
Parsons J, Saunders C (2004) Cognitive heuristics in software engineering: applying and extending anchoring and adjustment to artifact reuse. IEEE Trans Softw Eng (TSE) 30(12):873–888
Prechelt L, Unger-Lamprecht B, Philippsen M, Tichy W (2002) Two controlled experiments assessing the usefulness of design pattern documentation in program maintenance. IEEE Trans Softw Eng (TSE) 28(6):595–606
Raţiu D, Ducasse S, Girba T, Marinescu R (2004) Using history information to improve design flaws detection. In: Proceedings of the European conference on software maintenance and reengineering (CSMR), pp 223–232
Sheil BA (1981) The psychological study of programming. ACM Comput Surv (CSUR) 13(1):101–120
Shneiderman B (1977) Measuring computer program quality and comprehension. Int J Man-Machine Stud 9(4):465–478
Shneiderman B, Mayer R (1975) Towards a cognitive model of programmer behavior, Tech Rep, vol 37. Indiana University, Bloomington
Shull F, Singer J, Sjøberg DI (eds) (2007) Guide to advanced empirical software engineering. Springer, New York
Strauss AL (1987) Qualitative analysis for social scientists. Cambridge University Press
Takang A, Grubb PA, Macredie RD (1996) The effects of comments and identifier names on program comprehensibility: an experiential study. J Program Lang 4(3):143–167
Tan L, Yuan D, Krishna G, Zhou Y (2007) /*iComment: bugs or bad comments?*/. In: Proceedings of the ACM SIGOPS Symposium on Operating Systems Principles (SOSP) 41(6):145–158
Tan L, Zhou Y, Padioleau Y (2011) aComment: mining annotations from comments and code to detect interrupt related concurrency bugs. In: Proceedings of the International Conference on Software Engineering (ICSE)
Tan SH, Marinov D, Tan L, Leavens GT (2012) @tComment: Testing Javadoc comments to detect comment-code inconsistencies. In: Proceedings of the international conference on software testing, verification and validation (ICST), pp 260–269
Torchiano M (2002) Documenting pattern use in java programs. In: Proceedings of the international conference on software maintenance (ICSM), pp 230–233
Toutanova K, Manning CD (2000) Enriching the knowledge sources used in a maximum entropy part-of-speech tagger. In: Proceedings of the Joint SIGDAT conference on empirical methods in natural language processing and very large corpora (EMNLP/VLC-2000), association for computational linguistics, pp 63–70
Weissman L (1974a) Psychological complexity of computer programs: an experimental methodology. SIGPLAN Not 9(6):25–36
Weissman LM (1974b) A methodology for studying the psychological complexity of computer programs. PhD thesis
Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2000) Experimentation in software engineering - an introduction. Kluwer, Boston
Woodfield SN, Dunsmore HE, Shen VY (1981) The effect of modularization and comments on program comprehension. In: Proceedings of the international conference on software engineering (ICSE), pp 215–223
Yamashita A, Moonen L (2013) Do developers care about code smells? - An exploratory survey. In: Proceedings of the working conference on reverse engineering (WCRE), pp 242–251
Zhong H, Zhang L, Xie T, Mei H (2011) Inferring specifications for resources from natural language api documentation. Autom Softw Eng 18(3–4):227–261

Acknowledgments

The authors would like to thank the participants in the two studies for their precious time and effort. They made this work possible.

Author information

Authors and Affiliations

Soccer Lab., DGIGL, Polytechnique Montréal, 2900, Boulevard Édouard-Montpetit 2700, chemin de la Tour, Montréal, QC, H3T 1J4, Canada
Venera Arnaoudova
Department of Engineering, University of Sannio, Benevento, Italy
Massimiliano Di Penta
Soccer Lab., DGIGL, Polytechnique Montréal, Montréal, Canada
Giuliano Antoniol

Corresponding author

Correspondence to Venera Arnaoudova.

Additional information

Communicated by: Nachiappan Nagappan

This work is an extension of our previous paper (Arnaoudova et al. 2013)

Appendix

A. Detection

A.1 - “Get” more than accessor: Find accessor methods by identifying methods whose name starts with ‘get’ and ends with a substring that corresponds to an attribute in the same class, and where the attribute’s declared type and the accessor’s return type are the same. Then identify those accessors that perform more actions than returning the corresponding attribute. Cases where the attribute is set before it is returned (i.e., Proxy and Singleton design patterns) should not be considered part of this LA. For a detection built on top of an Abstract Syntax Tree (AST), expressions other than the return statement that returns the attribute are allowed only if they are children of a conditional check for a null value. Other complexity measures, such as LOC or McCabe’s Cyclomatic Complexity, can be used for a simpler but less accurate detection.
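As a rough illustration of rule A.1, the Python sketch below works on a simplified record of a Java class instead of a real AST. The `Method` and `JavaClass` records, the string-based null-check test, and the example class are all hypothetical stand-ins chosen for this sketch; they are not the authors' detector, and matching statements by text is far cruder than an AST-based check.

```python
from dataclasses import dataclass


@dataclass
class Method:
    name: str
    return_type: str
    statements: list  # top-level statements of the body, as Java source strings


@dataclass
class JavaClass:
    attributes: dict  # attribute name -> declared type
    methods: list


def accessed_attribute(method, attributes):
    """Return the attribute a getter refers to, or None if it is not an accessor."""
    if not method.name.startswith("get") or len(method.name) == 3:
        return None
    suffix = method.name[3:]
    attr = suffix[0].lower() + suffix[1:]
    # The attribute must exist and have the same type as the return type.
    if attributes.get(attr) == method.return_type:
        return attr
    return None


def is_allowed(statement, attr):
    """Allow only 'return attr;' and statements guarded by a null check on attr."""
    s = " ".join(statement.split())
    if s in (f"return {attr};", f"return this.{attr};"):
        return True
    return s.startswith(f"if ({attr} == null)") or s.startswith(f"if (this.{attr} == null)")


def get_more_than_accessor(cls):
    """Names of accessors that do more than return their attribute (A.1)."""
    flagged = []
    for m in cls.methods:
        attr = accessed_attribute(m, cls.attributes)
        if attr is not None and not all(is_allowed(s, attr) for s in m.statements):
            flagged.append(m.name)
    return flagged


# Hypothetical class used only to exercise the sketch.
example = JavaClass(
    attributes={"name": "String", "cache": "Map", "count": "int"},
    methods=[
        Method("getName", "String", ["return name;"]),
        Method("getCache", "Map",
               ["if (cache == null) { cache = new HashMap(); }", "return cache;"]),
        Method("getCount", "int", ["refresh();", "log(count);", "return count;"]),
    ],
)
flagged = get_more_than_accessor(example)
print(flagged)
```

In this toy input only `getCount` is reported: `getName` is a plain accessor, and `getCache` sets the attribute inside a null check before returning it, which the rule excludes.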

A.2 - “Is” returns more than a Boolean: Find methods whose name starts with “is” and whose return type is not Boolean.

A.3 - “Set” method returns: Find modifier methods (or, more generally, methods whose name starts with “set”) whose return type is different from void.

A.4 - Expecting but not getting single instance: Find methods that return a collection (e.g., array, list, vector) but whose name ends with a singular noun and does not contain a word implying a collection (e.g., array, list, vector).
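Rules A.2 to A.4 depend only on a method's name and return type, so they can be approximated without parsing method bodies. The Python sketch below is one possible reading of those rules. The `Signature` record, the list of collection types, and the "does not end in s" test for singular nouns are assumptions made for illustration; a real detector would need part-of-speech information rather than a suffix check.

```python
import re
from dataclasses import dataclass

# Assumed, non-exhaustive lists; adjust for the code base under analysis.
COLLECTION_TYPES = {"List", "ArrayList", "LinkedList", "Vector", "Set",
                    "HashSet", "Collection", "Map", "HashMap"}
COLLECTION_WORDS = {"array", "list", "vector", "set", "map", "collection"}


@dataclass
class Signature:
    name: str
    return_type: str


def words(identifier):
    """Split a camelCase identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", identifier)
    return [p.lower() for p in parts]


def is_collection(java_type):
    """True for arrays and for the generic or raw types listed above."""
    return java_type.endswith("[]") or java_type.split("<")[0] in COLLECTION_TYPES


def looks_singular(word):
    # Crude stand-in for part-of-speech tagging: anything not ending in "s".
    return not word.endswith("s")


def check_a_rules(sig):
    """Return the identifiers of the rules A.2 to A.4 that the signature matches."""
    w = words(sig.name)
    if not w:
        return []
    found = []
    if w[0] == "is" and sig.return_type not in ("boolean", "Boolean"):
        found.append("A.2")
    if w[0] == "set" and sig.return_type != "void":
        found.append("A.3")
    if (is_collection(sig.return_type) and looks_singular(w[-1])
            and not COLLECTION_WORDS.intersection(w)):
        found.append("A.4")
    return found


# Hypothetical signatures used only to exercise the sketch.
for sig in [Signature("isValid", "int"),
            Signature("setBreadth", "Dimension"),
            Signature("getExpansion", "List<Expansion>"),
            Signature("getExpansionList", "List<Expansion>")]:
    print(sig.name, check_a_rules(sig))
```

Splitting the identifier into words first keeps names such as `settings` or `isolate` from being mistaken for “set” or “is” methods.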

B.1 - Not implemented condition: Find methods with at least one conditional sentence in comments but with no conditional statements in the implementation (e.g., no control structures or ternary operators).
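A very rough approximation of rule B.1 can be written with two regular expressions, one for conditional wording in the comment and one for control structures in the body. The keyword lists below are assumptions, not taken from the paper. The sketch also does not strip string literals or nested comments from the body, and it treats any `?` (including a generic wildcard such as `List<?>`) as a ternary operator.

```python
import re

# Assumed cue words for a conditional sentence in a comment.
COMMENT_CONDITIONAL = re.compile(r"\b(if|unless|otherwise|in case)\b", re.IGNORECASE)
# Control structures and the ternary operator in Java source text.
CODE_CONDITIONAL = re.compile(r"\b(if|else|switch|while|for)\b|\?")


def not_implemented_condition(comment, body):
    """True when the comment states a condition that the body never tests (B.1)."""
    return bool(COMMENT_CONDITIONAL.search(comment)) and not CODE_CONDITIONAL.search(body)


# Hypothetical comment and bodies used only to exercise the sketch.
comment = "Returns the default value if the key is missing."
print(not_implemented_condition(comment, "return values.get(key);"))
print(not_implemented_condition(
    comment, "return values.containsKey(key) ? values.get(key) : defaultValue;"))
```

The first call reports the antipattern because the body contains no conditional at all; the second does not, because the ternary operator implements the condition described in the comment.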

B.2 - Validation method does not confirm: Find validation methods (e.g., method names starting with “validate”, “check”, “ensure”) whose return type is void and that do not throw an exception.

B.3 - “Get” method does not return: Find methods where the name suggests a return value (e.g., names starting with “get”, “return”) but where the return type is void.

B.4 - Not answered question: Find methods whose name is in the form of a predicate (e.g., starts with “is”, “has”) and whose return type is void.

B.5 - Transform method does not return: Find methods whose name suggests a transformation of an object (e.g., toSomething, source2target) but whose return type is void.

B.6 - Expecting but not getting a collection: Find methods whose name suggests that they return (e.g., starts with “get”, “return”) multiple objects (e.g., ends with a plural noun) but whose return type is not a collection.
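Rules B.2 to B.6 can likewise be approximated from the signature alone. In the Python sketch below, the `Signature` record and its `throws` flag are hypothetical, the prefix lists are limited to the examples given in the rules, and the plural test is a simple suffix check, so the results are only indicative.

```python
import re
from dataclasses import dataclass

# Assumed, non-exhaustive list of collection types.
COLLECTION_TYPES = {"List", "ArrayList", "LinkedList", "Vector", "Set",
                    "HashSet", "Collection", "Map", "HashMap"}


@dataclass
class Signature:
    name: str
    return_type: str
    throws: bool = False  # True if the method declares or throws an exception


def words(identifier):
    """Split a camelCase identifier into lowercase words (digits kept apart)."""
    parts = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", identifier)
    return [p.lower() for p in parts]


def is_collection(java_type):
    return java_type.endswith("[]") or java_type.split("<")[0] in COLLECTION_TYPES


def looks_plural(word):
    # Crude stand-in for part-of-speech tagging.
    return word.endswith("s") and not word.endswith("ss")


def check_b_rules(sig):
    """Return the identifiers of the rules B.2 to B.6 that the signature matches."""
    w = words(sig.name)
    if not w:
        return []
    first, last = w[0], w[-1]
    is_void = sig.return_type == "void"
    found = []
    if first in ("validate", "check", "ensure") and is_void and not sig.throws:
        found.append("B.2")
    if first in ("get", "return") and is_void:
        found.append("B.3")
    if first in ("is", "has") and is_void:
        found.append("B.4")
    # toSomething or source2target style names.
    if is_void and (first == "to" or "2" in w[1:-1]):
        found.append("B.5")
    if first in ("get", "return") and looks_plural(last) and not is_collection(sig.return_type):
        found.append("B.6")
    return found


# Hypothetical signatures used only to exercise the sketch.
for sig in [Signature("validateInput", "void"),
            Signature("checkAccess", "void", throws=True),
            Signature("getValue", "void"),
            Signature("isValid", "void"),
            Signature("source2target", "void"),
            Signature("getStats", "boolean")]:
    print(sig.name, check_b_rules(sig))
```

Because the B.6 test follows the rule's wording literally, a void method with a plural “get” name would be reported under both B.3 and B.6; whether to merge such reports is a design choice left open here.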

C.1 - Method name and return type are opposite: Find methods where the name and return type contain antonyms.

C.2 - Method signature and comment are opposite: Find methods whose name or return type has an antonym relation with their comment.
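The antonym checks in C.1 and C.2 need a source of antonym pairs. The Python sketch below uses a tiny hand-written table as a placeholder; the table, the function names, and the examples are assumptions for illustration, and a lexical database would be needed for realistic coverage. No stemming is applied, so “closes” would not match “open”.

```python
import re

# Placeholder antonym table; a real detector would query a lexical database.
ANTONYMS = {frozenset(pair) for pair in [
    ("enable", "disable"), ("start", "stop"), ("open", "close"),
    ("add", "remove"), ("show", "hide"), ("valid", "invalid"),
]}


def words(text):
    """Lowercase words from camelCase identifiers or from free text."""
    parts = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+", text)
    return {p.lower() for p in parts}


def antonym_pairs(left, right):
    """All (left word, right word) pairs that are antonyms according to the table."""
    return sorted((a, b) for a in words(left) for b in words(right)
                  if frozenset((a, b)) in ANTONYMS)


def name_and_type_opposite(name, return_type):
    """C.1: the method name and the return type contain antonyms."""
    return antonym_pairs(name, return_type)


def signature_and_comment_opposite(name, return_type, comment):
    """C.2: the name or return type has an antonym relation with the comment."""
    return antonym_pairs(name + " " + return_type, comment)


# Hypothetical examples used only to exercise the sketch.
print(name_and_type_opposite("disable", "ControlEnableState"))
print(signature_and_comment_opposite(
    "isValid", "boolean", "Returns true if the entry is invalid."))
```

Returning the matching pairs, instead of a plain Boolean, makes it easier for a reviewer to see why a method was reported.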

D.1 - Says one but contains many: Find attributes whose name ends with a singular noun and whose declared type is a collection.

D.2 - Name suggests Boolean but type does not: Find attributes whose name is structured as a predicate, i.e., starting with a verb in the third person (e.g., “is”, “has”) or ending with a verb in gerund/present participle form, but whose declared type is not Boolean.

E.1 - Says many but contains one: Find attributes whose name ends with a plural noun but whose type is neither a collection nor contains a plural noun.
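The attribute rules D.1, D.2, and E.1 compare an attribute's name with its declared type. The Python sketch below shows one way to approximate them. The `Attribute` record, the collection-type list, and the suffix-based tests for plurals and gerunds are assumptions made for this sketch; they will misclassify words such as “status” or “string”.

```python
import re
from dataclasses import dataclass

# Assumed, non-exhaustive list of collection types.
COLLECTION_TYPES = {"List", "ArrayList", "LinkedList", "Vector", "Set",
                    "HashSet", "Collection", "Map", "HashMap"}


@dataclass
class Attribute:
    name: str
    declared_type: str


def words(identifier):
    """Split a camelCase identifier or type name into lowercase words."""
    parts = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", identifier)
    return [p.lower() for p in parts]


def is_collection(java_type):
    return java_type.endswith("[]") or java_type.split("<")[0] in COLLECTION_TYPES


def looks_plural(word):
    # Crude stand-in for part-of-speech tagging.
    return word.endswith("s") and not word.endswith("ss")


def check_attribute_rules(attr):
    """Return the identifiers of the rules D.1, D.2, and E.1 that the attribute matches."""
    w = words(attr.name)
    if not w:
        return []
    collection = is_collection(attr.declared_type)
    found = []
    if collection and not looks_plural(w[-1]):
        found.append("D.1")
    predicate_like = w[0] in ("is", "has") or w[-1].endswith("ing")
    if predicate_like and attr.declared_type not in ("boolean", "Boolean"):
        found.append("D.2")
    type_has_plural = any(looks_plural(t) for t in words(attr.declared_type))
    if looks_plural(w[-1]) and not collection and not type_has_plural:
        found.append("E.1")
    return found


# Hypothetical attributes used only to exercise the sketch.
for attr in [Attribute("target", "Vector"),
             Attribute("isReached", "int"),
             Attribute("stats", "boolean"),
             Attribute("items", "List<Item>")]:
    print(attr.name, check_attribute_rules(attr))
```

The check on the type's own words implements the last clause of E.1: an attribute `options` of a type named `Options` is not reported, because the type name itself contains a plural noun.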

F.1 - Attribute name and type are opposite: Find attributes whose name and declared type contain antonyms.

F.2 - Attribute signature and comment are opposite: Find attributes whose name or declared type has an antonym relation with their comment.

About this article

Cite this article

Arnaoudova, V., Di Penta, M. & Antoniol, G. Linguistic antipatterns: what they are and how developers perceive them. Empir Software Eng 21, 104–158 (2016). https://doi.org/10.1007/s10664-014-9350-8

Published: 29 January 2015
Issue Date: February 2016
DOI: https://doi.org/10.1007/s10664-014-9350-8
