The impact of identifier style on effort and comprehension

Binkley, Dave; Davis, Marcia; Lawrie, Dawn; Maletic, Jonathan I.; Morrell, Christopher; Sharif, Bonita

doi:10.1007/s10664-012-9201-4

Published: 03 May 2012

Empirical Software Engineering, volume 18, pages 219–276 (2013)

Abstract

A family of studies investigating the impact of program identifier style on human comprehension is presented. Two popular identifier styles are examined, namely camel case and underscore. The underlying hypothesis is that identifier style affects the speed and accuracy of comprehending source code. To investigate this hypothesis, five studies were designed and conducted. The first study, which investigates how well humans read identifiers in the two different styles, focuses on low-level readability issues. The remaining four studies build on the first to focus on the semantic implications of identifier style. The studies involve 150 participants with varied demographics from two different universities. A range of experimental methods is used in the studies, including timed testing, read aloud, and eye tracking. These methods produce a broad set of measurements, and appropriate statistical methods, such as regression models and Generalized Linear Mixed Models (GLMMs), are applied to analyze the results. While unexpected, the results demonstrate that the tasks of reading and comprehending source code are fundamentally different from those of reading and comprehending natural language. Furthermore, as the task becomes similar to reading prose, the results become similar to work on reading natural language text. For more “source focused” tasks, experienced software developers appear to be less affected by identifier style; however, beginners benefit from the use of camel casing with respect to accuracy and effort.

Notes

The Levenshtein Edit Distance is the minimum number of operations needed to transform one string into another. An operation is an insertion, deletion, or substitution of a single character.
The Performance and Efficiency Hypotheses are restated in terms of the variables used to study them.
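
The edit distance defined in the first note can be sketched in a few lines of Python. This is a minimal illustration of the standard dynamic-programming formulation, not code from the article; the function name `levenshtein` and the sample identifiers are assumptions chosen for illustration, and the comparison is case sensitive.

```python
def levenshtein(source: str, target: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn source into target."""
    # previous[j] is the distance between the prefix of source processed
    # so far (initially empty) and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning the first i characters of source into "" takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Hypothetical identifiers: one inserted underscore plus one case change.
print(levenshtein("fullName", "full_name"))  # 2
```

Only two rows of the table are kept at a time, so memory use grows with the length of the target string rather than with the product of both lengths.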

Acknowledgements

Special thanks to all the participants; this work would not have been possible without your time. Our thanks to David Robbins for assisting in the use of the Tobii eye tracker and Matt Hearn for helping in the preparation and administration of the studies. Finally, thanks to our three reviewers for their thorough and well-considered reviews.

Author information

Authors and Affiliations

Department of Computer Science, Loyola University Maryland, Baltimore, MD, 21210-2699, USA
Dave Binkley & Dawn Lawrie
Department of Mathematics and Statistics, Loyola University Maryland, Baltimore, MD, 21210-2699, USA
Christopher Morrell
Center for Social Organization of Schools, Johns Hopkins University, Baltimore, MD, 21218, USA
Marcia Davis
Department of Computer Science, Kent State University, Kent, OH, 44242, USA
Jonathan I. Maletic
Department of Computer Science and Information Systems, Youngstown State University, Youngstown, OH, 44555, USA
Bonita Sharif

Corresponding author

Correspondence to Bonita Sharif.

Additional information

Editors: Giulio Antoniol and Keith Brian Gallagher

About this article

Cite this article

Binkley, D., Davis, M., Lawrie, D. et al. The impact of identifier style on effort and comprehension. Empir Software Eng 18, 219–276 (2013). https://doi.org/10.1007/s10664-012-9201-4

Published: 03 May 2012
Issue Date: April 2013
DOI: https://doi.org/10.1007/s10664-012-9201-4
