Automatically Generating Descriptive Texts in Logging Statements: How Far Are We?

  • Conference paper
Programming Languages and Systems (APLAS 2020)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12470)

Abstract

In most cases, logs are the only accurate information available for administrators to understand system behavior and diagnose the root causes of failures. However, because well-defined logging guidance is lacking, developers find it challenging to decide what to log, especially for logging statements that contain both descriptive texts and variables. In this paper, we explore the automatic generation of descriptive texts in logging statements and evaluate the effectiveness of various automatic generation methods. We propose that generating descriptive texts in logging statements can be cast as a retrieval-based Q&A task. According to the roles of query and answer, we design two retrieval strategies: Code&Code and Code&Log. To measure the similarity between the query and the answer, we use two types of retrieval algorithms: information-retrieval-based and neural-network-based algorithms. We conduct a systematic analysis of the effectiveness of these retrieval algorithms under the different retrieval strategies, and assess their accuracy using automatic metrics and human evaluation, from which we present five instructive findings. We believe that these findings have implications for both researchers and practitioners working on related problems. Moreover, we construct and release a log text dataset containing over 138K valid log texts from 85 Java projects in the Apache ecosystem for future logging statement analysis and generation.
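The retrieval framing can be illustrated with a small sketch. The abstract does not define the two strategies in detail, so the code below rests on assumptions: Code&Code is taken to mean matching the query code against code snippets that already have a log text, and Code&Log to mean matching the query code directly against candidate log texts. Jaccard similarity over camel-case-split tokens stands in for one possible information-retrieval-based measure. The function names and the toy corpus are hypothetical and are not taken from the paper or its released dataset.

```python
import re

def tokenize(text):
    """Split text into a set of lowercase word tokens, breaking camelCase identifiers."""
    # Insert a space at each lowercase/digit-to-uppercase boundary: "openFile" -> "open File"
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    return set(re.findall(r"[a-z]+", spaced.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def retrieve_code_and_code(query_code, corpus):
    """Assumed Code&Code reading: find the most similar *code*, return its log text."""
    query = tokenize(query_code)
    best = max(corpus, key=lambda pair: jaccard(query, tokenize(pair[0])))
    return best[1]

def retrieve_code_and_log(query_code, corpus):
    """Assumed Code&Log reading: find the most similar *log text* directly."""
    query = tokenize(query_code)
    best = max(corpus, key=lambda pair: jaccard(query, tokenize(pair[1])))
    return best[1]

# Hypothetical (code snippet, log text) pairs, for illustration only.
CORPUS = [
    ("file = openFile(path); if (file == null)", "failed to open file"),
    ("conn = connectServer(host, port);", "connecting to server"),
]

if __name__ == "__main__":
    query = "Socket s = connectServer(host, port);"
    print(retrieve_code_and_code(query, CORPUS))
    print(retrieve_code_and_log(query, CORPUS))
```

A neural-network-based variant would keep the same retrieval loop and replace `jaccard` with a similarity computed over learned embeddings of the query and the candidates.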

Notes

  1. https://github.com/liuxiaotong0302/LogSearch

Author information

Corresponding author

Correspondence to Ying Li.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, X., Jia, T., Li, Y., Yu, H., Yue, Y., Hou, C. (2020). Automatically Generating Descriptive Texts in Logging Statements: How Far Are We? In: Oliveira, B.C.d.S. (eds) Programming Languages and Systems. APLAS 2020. Lecture Notes in Computer Science, vol 12470. Springer, Cham. https://doi.org/10.1007/978-3-030-64437-6_13

  • DOI: https://doi.org/10.1007/978-3-030-64437-6_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64436-9

  • Online ISBN: 978-3-030-64437-6

  • eBook Packages: Computer Science, Computer Science (R0)
