A Personalized Code Formatter: Detection and Fixing

  • Conference paper
Software Technologies (ICSOFT 2021)

Abstract

The wide adoption of component-based software development and the (re)use of software residing in code hosting platforms have led to increased interest in source code readability and comprehensibility. One factor that clearly improves readability is consistent code styling and formatting across a project. To that end, many code formatting approaches define a set of rules that model a commonly accepted formatting. However, such rule sets rely mostly on expert judgment, are time-consuming to define and ignore the specific styling and formatting a team chooses to use. As a result, they can be too intrusive and may not be adopted. In this work, we present an automated mechanism that, given a set of source code files, can be trained to identify deviations from the formatting style of a given project and to provide recommendations towards maintaining a common styling across all of the project’s files. First, source code is transformed into small meaningful pieces, called tokens, which are used to train the models of our mechanism to predict the probability of a token being wrongly positioned. Then, a number of possible fixes are examined as replacements for the wrongly positioned token and, based on a scoring function, the most suitable fixes are recommended to the developer. Preliminary evaluation on various axes indicates that our approach can effectively detect formatting deviations from the project’s code styling and provide actionable recommendations to the developer.
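The pipeline outlined in the abstract (tokenize, estimate how likely each token is to be misplaced, then score candidate fixes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: a smoothed bigram model stands in for the paper's trained models, and every name here (`tokenize`, `BigramModel`, `deviation_scores`, `suggest_fix`) as well as the token classes and the scoring function are assumptions made for illustration only.

```python
import re
from collections import Counter, defaultdict

# Assumed token classes: identifiers and numbers collapse to a class label so the
# model learns layout rather than vocabulary; whitespace and punctuation stay verbatim.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\n|[ \t]+|.")


def tokenize(source):
    tokens = []
    for text in TOKEN_RE.findall(source):
        if text[0].isalpha() or text[0] == "_":
            tokens.append("<word>")
        elif text[0].isdigit():
            tokens.append("<num>")
        else:
            tokens.append(text)
    return tokens


class BigramModel:
    """Stand-in model: P(token | previous token) with add-one smoothing."""

    def __init__(self):
        self.pair_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, sources):
        for source in sources:
            tokens = ["<start>"] + tokenize(source)
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.pair_counts[prev][cur] += 1

    def prob(self, prev, cur):
        # The extra +1 in the denominator reserves mass for tokens never seen in training.
        counts = self.pair_counts.get(prev, Counter())
        return (counts[cur] + 1) / (sum(counts.values()) + len(self.vocab) + 1)


def deviation_scores(model, source):
    """Return (token, score) pairs; a higher score means a less expected position."""
    tokens = ["<start>"] + tokenize(source)
    return [(cur, 1.0 - model.prob(prev, cur)) for prev, cur in zip(tokens, tokens[1:])]


def suggest_fix(model, tokens, i):
    """Pick the whitespace token that best fits between the neighbours of tokens[i]."""
    prev = tokens[i - 1] if i > 0 else "<start>"
    nxt = tokens[i + 1] if i + 1 < len(tokens) else None

    def score(candidate):
        s = model.prob(prev, candidate)
        return s * model.prob(candidate, nxt) if nxt is not None else s

    # Only whitespace is offered as a replacement, so a fix never changes program meaning.
    candidates = [t for t in model.vocab if t.isspace() and t != tokens[i]]
    return max(candidates, key=score)


if __name__ == "__main__":
    model = BigramModel()
    model.train(["x = 1\ny = 2\nz = x + y\n"])  # project style: one space around '='
    sample = "a  = 3\n"                          # deviation: two spaces before '='
    scores = deviation_scores(model, sample)
    worst = max(range(len(scores)), key=lambda k: scores[k][1])
    print(repr(scores[worst][0]), "->", repr(suggest_fix(model, tokenize(sample), worst)))
```

With so little training data the scores are only indicative, and a bigram context is far too short for real projects. The sketch is meant to show the shape of the detect-then-fix loop, not to reproduce the evaluated approach.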

Notes

  1. https://gist.github.com/karanikiotis/263251decb86f839a3265cc2306355b2

  2. https://github.com/keras-team/keras

  3. https://github.com/KTH/codrep-2019/tree/master/Datasets

  4. https://github.com/KTH/codrep-2019

Author information

Corresponding author

Correspondence to Thomas Karanikiotis.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Karanikiotis, T., Chatzidimitriou, K.C., Symeonidis, A.L. (2022). A Personalized Code Formatter: Detection and Fixing. In: Fill, HG., van Sinderen, M., Maciaszek, L.A. (eds) Software Technologies. ICSOFT 2021. Communications in Computer and Information Science, vol 1622. Springer, Cham. https://doi.org/10.1007/978-3-031-11513-4_8

  • DOI: https://doi.org/10.1007/978-3-031-11513-4_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-11512-7

  • Online ISBN: 978-3-031-11513-4

  • eBook Packages: Computer Science, Computer Science (R0)
