Studying software logging using topic models

Li, Heng; Chen, Tse-Hsun (Peter); Shang, Weiyi; Hassan, Ahmed E.

doi:10.1007/s10664-018-9595-8

Published: 30 January 2018

Empirical Software Engineering, volume 23, pages 2655–2694 (2018)

Heng Li¹,
Tse-Hsun (Peter) Chen²,
Weiyi Shang² &
…
Ahmed E. Hassan¹

Abstract

Software developers insert logging statements in their source code to record important runtime information; such logged information is valuable for understanding system usage in production and debugging system failures. However, providing proper logging statements remains a manual and challenging task. Missing an important logging statement may increase the difficulty of debugging a system failure, while too much logging can increase system overhead and mask the truly important information. Intuitively, the actual functionality of a software component is one of the major drivers behind logging decisions. For instance, a method maintaining network communications is more likely to be logged than getters and setters. In this paper, we used the automatically computed topics of a code snippet to approximate its functionality. We studied the relationship between the topics of a code snippet and the likelihood of that snippet being logged (i.e., containing a logging statement). Our driving intuition is that certain topics in the source code are more likely to be logged than others. To validate our intuition, we conducted a case study on six open source systems, and we found that i) there exists a small number of “log-intensive” topics that are more likely to be logged than other topics; ii) each pair of the studied systems shares 12% to 62% common topics, and the likelihood of logging such common topics has a statistically significant correlation of 0.35 to 0.62 among all the studied systems; and iii) our topic-based metrics help explain the likelihood of a code snippet being logged, providing an improvement of 3% to 13% on AUC and 6% to 16% on balanced accuracy over a set of baseline metrics that capture the structural information of a code snippet. Our findings highlight that topics contain valuable information that can help guide and drive developers’ logging decisions.
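To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each code snippet already has a topic membership vector (for example from an LDA model) and a flag saying whether it contains a logging statement. The function names, the membership-weighted definition of a topic's logging likelihood, and the toy data are all illustrative assumptions; the paper's exact metric definitions may differ.

```python
from typing import List, Sequence


def topic_logging_likelihood(
    memberships: Sequence[Sequence[float]], logged: Sequence[bool]
) -> List[float]:
    """Membership-weighted share of logged snippets per topic.

    Hypothetical definition for illustration: a topic's score is the
    topic weight carried by logged snippets divided by the topic weight
    carried by all snippets.
    """
    n_topics = len(memberships[0])
    logged_weight = [0.0] * n_topics
    total_weight = [0.0] * n_topics
    for theta, is_logged in zip(memberships, logged):
        for k, weight in enumerate(theta):
            total_weight[k] += weight
            if is_logged:
                logged_weight[k] += weight
    return [lw / tw if tw > 0 else 0.0
            for lw, tw in zip(logged_weight, total_weight)]


def balanced_accuracy(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Mean of the true positive rate and the true negative rate.

    Assumes both classes occur at least once in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    pos = sum(1 for t in y_true if t)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def auc(y_true: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the probability that a logged snippet outscores an
    unlogged one (ties count as one half). Assumes both classes occur."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: three snippets, two topics (rows sum to 1).
    theta = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    is_logged = [True, True, False]
    print(topic_logging_likelihood(theta, is_logged))

    labels = [True, True, False, False]
    print(balanced_accuracy(labels, [True, False, False, False]))
    print(auc(labels, [0.9, 0.4, 0.5, 0.1]))
```

In this toy setup, a topic whose score is well above the overall share of logged snippets would play the role of a "log-intensive" topic, and the two evaluation functions show how an explanatory model's predictions could be compared against a baseline.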

Notes

1. https://qpid.apache.org/components/java-broker
2. https://issues.apache.org/jira
3. https://issues.apache.org/jira/browse/QPID-4038
4. Qpid-Java git commit: d606368b92f3952f57dbabd8553b3b6f426305e1
5. We share our replication package online: http://sailhome.cs.queensu.ca/replication/LoggingTopicModel
6. http://logging.apache.org/log4j
7. http://www.slf4j.org
8. https://commons.apache.org/logging
9. http://www.eclipse.org/jdt
10. http://activemq.apache.org

Author information

Authors and Affiliations

Software Analysis and Intelligence Lab (SAIL), Queen’s University, Kingston, Ontario, Canada
Heng Li & Ahmed E. Hassan
Department of Computer Science and Software Engineering, Concordia University, Montreal, Quebec, Canada
Tse-Hsun (Peter) Chen & Weiyi Shang

Corresponding author

Correspondence to Heng Li.

Additional information

Communicated by: Miryung Kim

About this article

Cite this article

Li, H., Chen, T.-H. (P.), Shang, W. et al. Studying software logging using topic models. Empir Software Eng 23, 2655–2694 (2018). https://doi.org/10.1007/s10664-018-9595-8

Issue Date: October 2018
DOI: https://doi.org/10.1007/s10664-018-9595-8
