
Commenting source code: is it worth it for small programming tasks?


Abstract

Maintaining a program is a time-consuming and expensive task in software engineering. Consequently, several approaches have been proposed to improve the comprehensibility of source code. One such approach is adding comments to the code, which enable developers to explain a program in their own words or with predefined tags. Some empirical studies indicate benefits of comments in certain situations, while others find no benefits at all. Thus, the real effect of comments on software development remains uncertain. In this article, we describe an experiment in which 277 participants, mostly professional software developers, performed small programming tasks on differently commented code. Based on quantitative and qualitative feedback, we i) partly replicate previous studies, ii) investigate how participants with different levels of experience perform when confronted with varying types of comments, and iii) discuss the opinions of developers on comments. Our results indicate that comments are considered more important, both in previous studies and by our participants, than they actually are for small programming tasks. While our participants consider other mechanisms, such as proper identifiers, more helpful, they also emphasize the necessity of comments in certain situations.



Notes

  1. http://www.oracle.com/technetwork/articles/javase/codeconvtoc-136057.html

  2. https://www.limesurvey.org/

  3. https://insights.stackoverflow.com/survey/2016#developer-profile-experience

  4. TIOBE: https://www.tiobe.com/tiobe-index/

    RedMonk: http://redmonk.com/sogrady/2017/06/08/language-rankings-6-17/

    PopularitY: http://pypl.github.io/PYPL.html

  5. We did not count lines, such as /⋆⋆ or ⋆⋆/, that do not contain any natural words.

  6. We applied this test since we want to compare two distributions from the same population, but we cannot assume a normal distribution.


Acknowledgments

This research is supported by DFG grant LE 3382/2-1.

Author information


Corresponding author

Correspondence to Sebastian Nielebock.

Additional information

Communicated by: Christoph Treude


Appendix

In the following, we present our 9 tasks and their solutions. To indicate the comment type, we always use single-line marks for implementation comments and multi-line marks for documentation comments. Versions with no comments contained none of the lines marked this way. The solutions were used as exemplary sketches, but we checked each solution individually.
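To illustrate this convention, the following constructed snippet (not one of the task listings) shows both comment types as they were marked in the commented code versions:

/**
 * Documentation comment: describes what foo() does,
 * marked with multi-line comment delimiters.
 */
int foo(int value) {
    // Implementation comment: explains a single step,
    // marked with single-line comment delimiters.
    return value + 1;
}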

A.1 Apply Code (Tasks 1–3)

Task 1

Call method foo() in such a way that it returns 7.

Listing 2: Task 1

Listing 3: Task 1 Solution

Task 2

Call method foo() in such a way that it returns .

Listing 4: Task 2

Listing 5: Task 2 Solution

Task 3

Change (only) the list objectList in the method bar so that the call of this method returns .

Listing 6: Task 3

Listing 7: Task 3 Solution

A.2 Bug Fixing (Tasks 4–6)

Task 4

The foo() method throws a runtime exception for the following input. Fix the error so that the expected result [3, 8] is returned.


foo(new int[]{1, 3, 4, 5, 8, 11, 13}, new int[]{2, 3, 5, 7, 8, 9}); >> [3,8]
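Since the listings appear only as figures, the following minimal sketch illustrates the expected behavior of a fixed foo(): it returns the elements contained in both arrays. The class name and implementation are assumptions for illustration; this is not the code from Listing 8 or 9.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Task4Sketch {

    // Returns the values that occur in both arrays, preserving the order
    // of the second array; for the input above this yields [3, 8].
    static List<Integer> foo(int[] first, int[] second) {
        Set<Integer> firstValues = new HashSet<>();
        for (int value : first) {
            firstValues.add(value);
        }
        List<Integer> common = new ArrayList<>();
        for (int value : second) {
            if (firstValues.contains(value)) {
                common.add(value);
            }
        }
        return common;
    }

    public static void main(String[] args) {
        System.out.println(foo(new int[]{1, 3, 4, 5, 8, 11, 13},
                               new int[]{2, 3, 5, 7, 8, 9})); // [3, 8]
    }
}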

Listing 8: Task 4

Listing 9: Task 4 Solution

Task 5

The foo() method contains an error. Fix it so that the expected results are returned:


foo("abcd", "acbd") >> 1 foo("abcd", "badc") >> 2 foo("abcdef", "defabc") >> 3

It can be assumed that both strings have the same length and are not null.

Listing 10: Task 5

Listing 11: Task 5 Solution

After the study was completed, we noticed that our solution handles only strings with an even number of characters; for example, for "abc" and "cab", the code would calculate only one transposition due to integer division. However, all participants seemed to be unaware of this problem, as none of the given answers handled this scenario. Thus, we accepted all answers that solve the task for strings with an even number of characters.

Please note that this task does not use the Hamming distance, as a transposition is defined as a single switch of two characters. Thus, we considered "abcd" and "efgh" as invalid input, because no characters within the strings are switched.
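To illustrate the integer-division issue described above, the following minimal sketch reproduces the counting logic (number of mismatching positions divided by two). It is a constructed example with assumed names, not the code from Listing 11.

public class Task5Sketch {

    // Counts transpositions as (number of mismatching positions) / 2.
    // Due to the integer division, an odd number of mismatches
    // (e.g., "abc" vs. "cab") is rounded down.
    static int foo(String first, String second) {
        int mismatches = 0;
        for (int i = 0; i < first.length(); i++) {
            if (first.charAt(i) != second.charAt(i)) {
                mismatches++;
            }
        }
        return mismatches / 2; // integer division
    }

    public static void main(String[] args) {
        System.out.println(foo("abcd", "acbd"));     // 1
        System.out.println(foo("abcd", "badc"));     // 2
        System.out.println(foo("abcdef", "defabc")); // 3
        System.out.println(foo("abc", "cab"));       // 1 (the edge case described above)
    }
}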

Task 6

Fix the compile-time error in foo().

Listing 12: Task 6

Listing 13: Task 6 Solution

A.3 Extend Code (Tasks 7–9)

Task 7

Extend the method foo with an int parameter, which is returned if number equals 0. Example:


foo(new int[]{5,13,31}, 7); >> 7

Listing 14: Task 7

Listing 15: Task 7 Solution

Task 8

Extend method foo() to ignore null Strings for the output.


String[] input = {"Tic", null, "Tac", "Toe"}; join(input, ","); >> Tic,Tac,Toe

Listing 16: Task 8

Listing 17: Task 8 Solution

Task 9

Extend Class2 with a method bar() of the return type Integer that reverses the operation of foo().

Listing 18: Task 9

Listing 19: Task 9 Solution


About this article


Cite this article

Nielebock, S., Krolikowski, D., Krüger, J. et al. Commenting source code: is it worth it for small programming tasks? Empir Software Eng 24, 1418–1457 (2019). https://doi.org/10.1007/s10664-018-9664-z


  • DOI: https://doi.org/10.1007/s10664-018-9664-z
