Mining the use of higher-order functions:

An exploratory study on Scala programs

Empirical Software Engineering

Abstract

A higher-order function takes one or more functions as inputs or returns a function as its output, which supports more general function definitions. In modern programming languages, higher-order functions are designed as a feature to enhance usability and scalability. Abstracting higher-order functions from existing functions reduces the number of similar functions and improves code reuse. However, due to their complexity, higher-order functions are not widely defined or called in practice. In this paper, we investigate the use of higher-order functions in Scala programs. We collected 8,285 higher-order functions from the 35 most-starred Scala projects on GitHub and conducted an exploratory study by answering five research questions on the use of higher-order functions, covering the data scale, the definition types, the definition distribution, the factor that correlates with the function calls, and the developer contribution. Our study yields five main empirical results about the common use of higher-order functions in Scala programs. (1) Across the 35 Scala projects, 6.84% of functions are defined as higher-order functions on average, and the average number of calls per function shows that higher-order functions are called more frequently than first-order functions. (2) Among all higher-order functions in the study, 87.35% of definitions and 90.66% of calls belong to the type that only takes functions as parameters. (3) Three measurements (lines of executable code, cyclomatic complexity, and code-style warnings) are lower for higher-order functions than for first-order functions. (4) Regression analysis on all projects suggests that the number of calls to higher-order functions correlates highly with cyclomatic complexity. (5) Across all projects in the study, 43.82% of calls to higher-order functions are written by the same developers who defined the functions, and the results show that the top 20% of authors of higher-order functions favor defining or calling higher-order functions over first-order functions. This study can be viewed as a preliminary step toward understanding the use of higher-order functions and motivating further investigation in Scala programs.
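To make the definition in the abstract concrete, the following is a minimal Scala sketch of the kinds of functions it distinguishes: a first-order function, a higher-order function that only takes a function as a parameter, one that returns a function, and one that does both. The object and function names (HofSketch, applyTwice, adder, compose2, increment) are hypothetical and chosen for illustration; they are not taken from the studied projects or from the paper's data set.

```scala
// Illustrative sketch only: all names here are hypothetical.
object HofSketch {
  // First-order function: neither takes nor returns a function.
  def increment(x: Int): Int = x + 1

  // Higher-order function that only takes a function as a parameter.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Higher-order function that returns a function.
  def adder(n: Int): Int => Int = (x: Int) => x + n

  // Higher-order function that both takes and returns functions:
  // the result applies f first and then g.
  def compose2(f: Int => Int, g: Int => Int): Int => Int =
    (x: Int) => g(f(x))

  def main(args: Array[String]): Unit = {
    // Calling the functions defined above.
    println(applyTwice(increment, 3))
    println(adder(10)(5))
    println(compose2(increment, adder(10))(1))
    // Array.map from the standard library is itself a higher-order function.
    println(Array(1, 2, 3).map(increment).toList)
  }
}
```

Passing the method increment where a function value is expected relies on Scala's automatic conversion of methods to functions, which is why no explicit lambda is needed at the call sites.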

Notes

  1. Array.map() in Scala, http://www.scala-lang.org/api/2.12.8/scala/Array.html#map[B](f:A=>B):Array[B].

  2. Java has supported higher-order functions since Version 8, released in 2014.

  3. Project dotty, http://github.com/lampepfl/dotty.

  4. Scala currying, http://docs.scala-lang.org/tour/currying.html.

  5. The collected data in this study are publicly available, http://cstar.whu.edu.cn/p/scalahof/.

  6. SemanticDB, http://scalameta.org/docs/semanticdb/guide.html.

  7. Scala projects with stars, http://github.com/search?l=Scala&o=desc&q=scala&s=stars&type=Repositories, accessed on September 1st, 2019.

  8. Scalameta, http://scalameta.org/.

  9. ScalaStyle, http://www.scalastyle.org/.

Acknowledgements

The work is supported by the National Key R&D Program of China under Grant No. 2018YFB1003901, the National Natural Science Foundation of China under Grant No. 61872273, and the Advance Research Projects of Civil Aerospace Technology – Communications, Navigation and Remote Sensing Integrated Applications and Multi-source Spatial Data Fusion Technology.

Author information

Corresponding author

Correspondence to Jifeng Xuan.

Additional information

Communicated by: Richard Paige, Jordi Cabot and Neil Ernst

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Software Applications (NASAC)

About this article

Cite this article

Xu, Y., Wu, F., Jia, X. et al. Mining the use of higher-order functions: An exploratory study on Scala programs. Empir Software Eng 25, 4547–4584 (2020). https://doi.org/10.1007/s10664-020-09842-7

  • DOI: https://doi.org/10.1007/s10664-020-09842-7
