DOI: 10.1145/3361242.3362774
research-article

A Neural-Network based Code Summarization Approach by Using Source Code and its Call Dependencies

Published: 28 October 2019

ABSTRACT

Code summarization aims to generate natural-language descriptions of source code, and it can greatly help program comprehension and software maintenance. Current code summarization approaches have made progress using neural networks. However, most of these methods focus on learning the semantics and syntax of source code snippets and ignore the dependencies between pieces of code. In this paper, we propose a novel neural-network-based method that uses knowledge of the call dependencies between a piece of source code and its related code. We extract the call dependencies from the source code, transform them into a token sequence of method names, and leverage a Seq2Seq model for code summarization using the combination of source code and call dependency information. About 100,000 pieces of code data were collected from 1,000 open-source Java projects on GitHub for the experiment. This large-scale experiment shows that by considering not only the code itself but also the code it calls, the code summarization model can be improved, reaching a BLEU score of 33.08.
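The input construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The regex-based call extraction is a rough stand-in for a real Java parser, and the helper names (`tokenize_code`, `extract_called_methods`, `build_model_input`) and the `<CALLS>` separator token are hypothetical. The abstract does not say how the two sequences are combined, so plain concatenation is an assumption here.

```python
import re

# Java keywords that may be followed by "(" but are not method calls.
JAVA_KEYWORDS = {"if", "for", "while", "switch", "catch", "return",
                 "synchronized", "super", "this", "throw", "try"}

# An identifier immediately followed by "(" is treated as a call site.
# Assumption: this also captures constructor calls such as `new Foo()`.
CALL_PATTERN = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def tokenize_code(code):
    """Split Java source into identifier, number, and punctuation tokens."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)


def extract_called_methods(body):
    """Return the names of methods invoked in a method body.

    Names are kept in order of first appearance, without duplicates.
    Hypothetical helper: a regex approximation, not a real parser.
    """
    seen = []
    for name in CALL_PATTERN.findall(body):
        if name not in JAVA_KEYWORDS and name not in seen:
            seen.append(name)
    return seen


def build_model_input(body, sep="<CALLS>"):
    """Concatenate the code tokens and the call-dependency name sequence.

    The separator token and the concatenation itself are assumptions.
    """
    return tokenize_code(body) + [sep] + extract_called_methods(body)


if __name__ == "__main__":
    body = "{ if (isValid(user)) { repo.save(user); log(user.getName()); } }"
    print(extract_called_methods(body))
    print(build_model_input(body))
```

The resulting token list would then be fed to the encoder of a Seq2Seq model. Pass only the method body: a full declaration would make the regex count the method's own name as a call.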

    • Published in

      Internetware '19: Proceedings of the 11th Asia-Pacific Symposium on Internetware
      October 2019
      179 pages
      ISBN: 9781450377010
      DOI: 10.1145/3361242

      Copyright © 2019 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Internetware '19 Paper Acceptance Rate: 20 of 35 submissions, 57%
      Overall Acceptance Rate: 55 of 111 submissions, 50%
