Improving Logging Prediction on Imbalanced Datasets: A Case Study on Open Source Java Projects

Sangeeta Lal, Neetu Sardana, Ashish Sureka
Copyright: © 2016 | Volume: 7 | Issue: 2 | Pages: 29
ISSN: 1942-3926 | EISSN: 1942-3934 | EISBN13: 9781466690653 | DOI: 10.4018/IJOSSP.2016040103
Cite Article

MLA

Lal, Sangeeta, et al. "Improving Logging Prediction on Imbalanced Datasets: A Case Study on Open Source Java Projects." IJOSSP, vol. 7, no. 2, 2016, pp. 43-71. http://doi.org/10.4018/IJOSSP.2016040103

APA

Lal, S., Sardana, N., & Sureka, A. (2016). Improving Logging Prediction on Imbalanced Datasets: A Case Study on Open Source Java Projects. International Journal of Open Source Software and Processes (IJOSSP), 7(2), 43-71. http://doi.org/10.4018/IJOSSP.2016040103

Chicago

Lal, Sangeeta, Neetu Sardana, and Ashish Sureka. "Improving Logging Prediction on Imbalanced Datasets: A Case Study on Open Source Java Projects." International Journal of Open Source Software and Processes (IJOSSP) 7, no. 2 (2016): 43-71. http://doi.org/10.4018/IJOSSP.2016040103

Abstract

Logging is an important yet difficult decision for OSS developers. Machine-learning models have proven useful across several steps of OSS development, including logging, and several recent studies propose such models to predict logged code constructs. The prediction performance of these models is limited by the class-imbalance problem: logged code constructs are far fewer than non-logged ones. No previous study analyzes the class-imbalance problem for logged code construct prediction. The authors first analyze the performance of J48, RF, and SVM classifiers for logging prediction on catch-blocks and if-blocks over imbalanced datasets. Second, the authors propose LogIm, an ensemble and threshold-based machine-learning model. Third, the authors evaluate the performance of LogIm on three open-source projects. On average, the LogIm model improves the performance of the baseline classifiers J48, RF, and SVM by 7.38%, 9.24%, and 4.6% for catch-block logging prediction, and by 12.11%, 14.95%, and 19.13% for if-block logging prediction.
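The abstract does not specify LogIm's internals, but its two named ingredients — an ensemble of the baseline classifiers and a tuned decision threshold — can be sketched as follows. This is a hypothetical illustration on synthetic imbalanced data (roughly 10% positives, mimicking the scarcity of logged constructs), using a decision tree as a stand-in for J48, not a reproduction of the paper's features or procedure:

```python
# Sketch: threshold-tuned soft-voting ensemble for an imbalanced binary task.
# Hypothetical stand-in for LogIm; the paper's exact design is not given here.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data: ~90% negative (non-logged), ~10% positive (logged).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Soft-voting ensemble of a decision tree (J48 analogue), RF, and SVM.
ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    voting="soft",
).fit(X_tr, y_tr)

# Instead of the default 0.5 cutoff, sweep the decision threshold and keep
# the one that maximizes F1 on held-out data -- the usual remedy when the
# positive class is rare.
proba = ensemble.predict_proba(X_te)[:, 1]
thresholds = np.linspace(0.05, 0.95, 19)
scores = [f1_score(y_te, proba >= t) for t in thresholds]
best_t = thresholds[int(np.argmax(scores))]
print(f"best threshold = {best_t:.2f}, F1 = {max(scores):.3f}")
```

Because the sweep includes 0.5, the tuned threshold can never score worse than the default cutoff on the held-out split; on imbalanced data the optimum typically sits below 0.5, trading some precision for recall on the rare class.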
