Abstract
Program comprehension concerns the ability to understand code written by others. But not all code is the same. We use an experimental platform fashioned as an online game-like environment to measure how quickly and accurately 220 professional programmers can interpret code snippets with similar functionality but different structures; snippets that take longer to understand or produce more errors are considered harder. The results indicate, inter alia, that for loops are significantly harder than ifs, that some but not all negations make a predicate harder, and that loops counting down are slightly harder than loops counting up. This demonstrates how the effect of syntactic structures, different ways to express predicates, and the use of known idioms can be measured empirically, and that syntactic structures are not necessarily the most important factor. We also found that the two metrics, time to understanding and errors made, are not necessarily equivalent. For example, loops counting down took slightly longer, but loops with unusual bounds caused many more errors. By amassing many more empirical results like these, it may be possible to derive better code complexity metrics than we have today, and also to better appreciate their limitations.








Notes
Subjects who did not answer the academic degree questions were assigned to the no-degree group.
Acknowledgments
Many thanks to Micha Mandel for his help with the statistical analysis, and to the anonymous reviewers for their comments and suggestions.
Additional information
Communicated by: David Lo and Alexander Serebrenik
Dror Feitelson holds the Berthold Badler chair in Computer Science. This research was supported by the Israel Science Foundation (grant no. 407/13). This paper is an invited extended version of a paper from ICPC 2017.
Cite this article
Ajami, S., Woodbridge, Y. & Feitelson, D.G. Syntax, predicates, idioms — what really affects code complexity? Empir Software Eng 24, 287–328 (2019). https://doi.org/10.1007/s10664-018-9628-3
DOI: https://doi.org/10.1007/s10664-018-9628-3