Towards Extracting Reusable and Maintainable Code Snippets

  • Conference paper
Software Technologies (ICSOFT 2022)

Abstract

Given the wide adoption of the agile software development paradigm, where efficient collaboration and effective maintenance are of utmost importance, and the widespread (re)use of software residing in code hosting platforms, the need to produce high-quality code is evident. A prerequisite for acceptable software reusability and maintainability is the use of idiomatic code, built on syntactic fragments that recur frequently across software projects and are characterized by high quality. In this work, we propose a methodology that harnesses data from the most popular GitHub repositories to automatically identify reusable and maintainable code idioms by grouping code blocks with similar structural and semantic information. We also apply the same methodology at the single-project level, in an attempt to identify blocks of code that recur frequently across the files of a team. Preliminary evaluation indicates that our approach can identify commonly used, reusable and maintainable code idioms and code blocks that can be given to developers as actionable recommendations.
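As a rough illustration of what grouping code blocks by structural information can look like, the sketch below reduces each block to the sequence of its syntax-tree node types and groups blocks whose sequences match exactly. It is not the methodology of the paper: it uses Python's standard `ast` module on Python snippets, ignores semantic information entirely, and replaces similarity-based grouping with exact matching. The function names `structural_signature` and `group_blocks` and the sample blocks are hypothetical.

```python
import ast
from collections import defaultdict


def structural_signature(source: str) -> tuple:
    """Reduce a code block to the sequence of its AST node type names.

    Identifier names and literal values are ignored, so two blocks that
    differ only in naming produce the same signature (an assumption made
    for this sketch, not a claim about the paper's representation).
    """
    tree = ast.parse(source)
    return tuple(
        type(node).__name__
        for node in ast.walk(tree)
        # Expression-context markers carry no structural information here.
        if not isinstance(node, (ast.Load, ast.Store, ast.Del))
    )


def group_blocks(blocks: dict) -> list:
    """Group block names whose structural signatures are identical.

    Only groups with at least two members are returned, since a block
    that occurs once is not a recurring pattern.
    """
    groups = defaultdict(list)
    for name, source in blocks.items():
        groups[structural_signature(source)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical sample blocks: "a" and "b" differ only in identifiers.
blocks = {
    "a": "for x in xs:\n    total += x\n",
    "b": "for item in items:\n    acc += item\n",
    "c": "if flag:\n    print(flag)\n",
}
print(group_blocks(blocks))  # [['a', 'b']]
```

Exact matching keeps the sketch short; grouping blocks that are merely similar would require a distance between trees and a clustering step on top of it.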

Author information

Corresponding author

Correspondence to Thomas Karanikiotis.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Karanikiotis, T., Symeonidis, A.L. (2023). Towards Extracting Reusable and Maintainable Code Snippets. In: Fill, HG., van Sinderen, M., Maciaszek, L.A. (eds) Software Technologies. ICSOFT 2022. Communications in Computer and Information Science, vol 1859. Springer, Cham. https://doi.org/10.1007/978-3-031-37231-5_9

  • DOI: https://doi.org/10.1007/978-3-031-37231-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37230-8

  • Online ISBN: 978-3-031-37231-5

  • eBook Packages: Computer Science (R0)
