Improved Searchability of Bug Reports Using Content-Based Labeling with Machine Learning of Sentences

  • Conference paper
Knowledge-Based Software Engineering: 2018 (JCKBSE 2018)

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 108)

Abstract

Most stakeholders refer to past bug reports when they encounter a problem, since bug reports contain useful information. However, searching for specific content is difficult because there are many bug reports, and the desired content depends on the viewpoint of the stakeholder. A full-text search returns unwanted content along with the desired content, which makes it costly. Although this problem has been previously noted, a solution has yet to be proposed. Herein we propose the Content-based Labeling Method as a solution. This method organizes the information in a bug report by labeling each sentence based on its content, allowing stakeholders' viewpoints to be considered. We evaluate the resulting improvement in searchability. In the experimental results, the Content-based Labeling Method improves searchability as measured by the F-measure and precision.
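
The contrast the abstract draws between a full-text search and a search restricted by sentence labels can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the label names ("symptom", "cause", "workaround", "other"), the sample sentences, and the hand-assigned labels are hypothetical, and no classifier is trained here. It is not the authors' implementation. It only shows how precision, recall, and the F-measure are computed for the two kinds of search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentence:
    text: str
    label: str  # hypothetical content label, assigned by hand in this sketch


def full_text_search(sentences, keyword):
    """Return every sentence containing the keyword, whatever its label."""
    kw = keyword.lower()
    return [s for s in sentences if kw in s.text.lower()]


def labeled_search(sentences, keyword, label):
    """Return keyword matches restricted to one content label."""
    return [s for s in full_text_search(sentences, keyword) if s.label == label]


def precision_recall_f(retrieved, relevant):
    """Compute precision, recall, and the balanced F-measure (F1)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical bug report, split into sentences with hand-assigned labels.
REPORT = [
    Sentence("The app crashes when the cache is full.", "symptom"),
    Sentence("The crash is caused by an unchecked cache index.", "cause"),
    Sentence("Clearing the cache avoids the crash.", "workaround"),
    Sentence("Thanks for the quick reply.", "other"),
]

# Assume a stakeholder who only wants the root cause of the cache problem.
RELEVANT = [REPORT[1]]

if __name__ == "__main__":
    for name, result in [
        ("full text", full_text_search(REPORT, "cache")),
        ("labeled", labeled_search(REPORT, "cache", "cause")),
    ]:
        p, r, f = precision_recall_f(result, RELEVANT)
        print(f"{name}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this toy data the full-text search also returns the symptom and workaround sentences, so its precision is lower than that of the label-restricted search. In practice the labels would come from a trained classifier and could be wrong, which would lower recall for the labeled search.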

Author information

Corresponding author

Correspondence to Yuki Noyori.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Noyori, Y., Washizaki, H., Fukazawa, Y., Kanuka, H., Ooshima, K., Tsuchiya, R. (2019). Improved Searchability of Bug Reports Using Content-Based Labeling with Machine Learning of Sentences. In: Virvou, M., Kumeno, F., Oikonomou, K. (eds) Knowledge-Based Software Engineering: 2018. JCKBSE 2018. Smart Innovation, Systems and Technologies, vol 108. Springer, Cham. https://doi.org/10.1007/978-3-319-97679-2_8
