poster

On the effectiveness of labeled latent dirichlet allocation in automatic bug-report categorization

Author:
Minhaz F. Zibran

University of New Orleans, New Orleans, LA


ICSE '16: Proceedings of the 38th International Conference on Software Engineering Companion, May 2016, Pages 713–715
https://doi.org/10.1145/2889160.2892646

Published: 14 May 2016

ABSTRACT

Bug-reports are valuable sources of information. However, studying their natural-language content demands tedious manual interpretation. This difficulty limits the scale of empirical studies that rely on interpreting and categorizing bug-reports. In this work, we investigate the effectiveness of Labeled Latent Dirichlet Allocation (LLDA) in automatically classifying bug-reports into a predefined set of categories.

Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. Software libraries and repositories

Published in
ICSE '16: Proceedings of the 38th International Conference on Software Engineering Companion
May 2016
946 pages
ISBN: 9781450342056
DOI: 10.1145/2889160
General Chair: Laura Dillon (Michigan State University)
Program Chairs: Willem Visser (Stellenbosch University, South Africa), Laurie Williams (North Carolina State University)
Copyright © 2016 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
LLDA
automatic categorization
bug-report
topic modelling
Qualifiers
- poster
Acceptance Rates
Overall acceptance rate: 276 of 1,856 submissions (15%)
