Demystifying the challenges and benefits of analyzing user-reported logs in bug reports

Empirical Software Engineering

Abstract

Logs in bug reports provide important debugging information for developers. During debugging, developers need to study the bug report and examine user-provided logs to understand the system executions that lead to the problem. Intuitively, user-provided logs illustrate the problems that users encounter and may help developers debug. However, some logs may be incomplete or inaccurate, which can make it difficult for developers to diagnose the bug and thus delay the bug-fixing process. In this paper, we conduct an empirical study of the benefits of user-provided logs and the challenges that developers may encounter when analyzing them. In particular, we study both log snippets and exception stack traces in bug reports. We conduct our study on 10 large-scale open-source systems with a total of 1,561 bug reports with logs (BRWL) and 7,287 bug reports without logs (BRNL). Our findings show that: 1) BRWL take longer to resolve (median ranges from 3 to 91 days) than BRNL (median ranges from 1 to 25 days). We also find that reporters may not attach accurate or sufficient logs (i.e., developers often ask for additional logs in the Comments section of a bug report), which extends the bug resolution time. 2) Logs often provide a good indication of where a bug is located. Most bug reports (73%) have overlaps between the classes that generate the logs and their corresponding fixed classes. However, there is still a large number of bug reports with no overlap between the logged and fixed classes. 3) Our manual study finds that system execution information is often missing from the logs. Many logs show only the point of failure (e.g., an exception) and do not provide a direct hint about the actual root cause. In fact, through call graph analysis, we find that in 28% of the studied bug reports the fixed classes are reachable from the logged classes but are not visible in the logs attached to the bug reports. In addition, some logging statements are removed from the source code as the system evolves, which may cause further challenges in analyzing the logs. In short, our findings highlight possible future research directions to better help practitioners attach or analyze logs in bug reports.
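
The two class-level checks behind findings 2 and 3 (overlap between logged and fixed classes, and reachability through the call graph) can be illustrated with a minimal Python sketch. It is not the tooling used in the study: the function names, the regular expression, and the dictionary-based call graph are assumptions made for illustration, and the parser handles only Java exception stack frames, not log snippets, which would first have to be mapped back to their logging statements.

```python
import re
from collections import deque

# Hypothetical pattern for Java stack-trace frames such as
#   at org.example.io.Reader.read(Reader.java:42)
# Group 1 captures the fully qualified class name (method name excluded).
FRAME = re.compile(r"\bat\s+([\w$.]+)\.[\w$<>]+\(")


def logged_classes(log_text):
    """Return the set of class names that appear in stack frames."""
    return {m.group(1) for m in FRAME.finditer(log_text)}


def has_overlap(logged, fixed):
    """True if at least one logged class is also a fixed class."""
    return bool(logged & fixed)


def reachable_fixed_classes(logged, fixed, call_graph):
    """Fixed classes reachable from the logged classes but absent from the log.

    call_graph is an assumed class-level adjacency mapping:
    caller class -> iterable of callee classes.
    """
    seen = set(logged)
    queue = deque(logged)
    while queue:  # breadth-first traversal of the call graph
        cls = queue.popleft()
        for callee in call_graph.get(cls, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return (seen & fixed) - logged


# Illustrative, made-up data: the log shows only the point of failure.
log = """java.lang.NullPointerException
    at org.example.io.Reader.read(Reader.java:42)
    at org.example.app.Main.run(Main.java:10)
"""
fixed = {"org.example.io.Buffer"}
call_graph = {"org.example.io.Reader": {"org.example.io.Buffer"}}

logged = logged_classes(log)
overlap = has_overlap(logged, fixed)
hidden = reachable_fixed_classes(logged, fixed, call_graph)
```

In this made-up report the fixed class never appears in the stack trace, so `overlap` is false, yet the class is reachable from a logged class. That is the situation the abstract describes for 28% of the studied bug reports.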

Notes

  1. https://github.com/SPEAR-SE/LogInBugReportsEmpirical_Data


Author information

Corresponding author

Correspondence to An Ran Chen.

Additional information

Communicated by: Romain Robbes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, A.R., Chen, TH.(P.) & Wang, S. Demystifying the challenges and benefits of analyzing user-reported logs in bug reports. Empir Software Eng 26, 8 (2021). https://doi.org/10.1007/s10664-020-09893-w

  • DOI: https://doi.org/10.1007/s10664-020-09893-w

Keywords

Navigation