ABSTRACT
Software users often encounter failures, run into problems, or have further requests regarding the software they use. Software companies provide customer care or customer support to handle these issues, some of which can be resolved right away while others must be forwarded to the responsible persons. Efficient problem handling is therefore important for software companies to maintain customer satisfaction. This paper reports the case of a software company in Thailand whose derivatives trading software is used by a large number of broker companies and their customers. The company has found that reported software problems are sometimes classified incorrectly, so they are directed to the wrong persons and have to be reclassified. Because of the nature of trading software, assigning problem reports to the responsible persons correctly and in a timely manner is crucial. This paper presents a multiclass classification method for classifying the 11 problem report types found in this trading software. The machine learning algorithms applied are Multinomial Naïve Bayes, Linear SVC, Random Forest, and Logistic Regression, and both lexical features and metadata of the problem reports are considered. In an experiment, Linear SVC performed best, achieving an F1 score of 91.69% and an accuracy of 91.79% when using unigram and trigram features of the problem report text, which is written in Thai and English. The paper also presents a support tool that classifies new problem reports and provides a dashboard of the problems found in this derivatives trading software to help the software team manage maintenance.
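As a rough illustration of the kind of pipeline the abstract describes, the sketch below extracts unigram and trigram features from report text and trains a small Multinomial Naïve Bayes classifier, one of the four algorithms listed. It is not the authors' implementation: it uses only the Python standard library rather than the Linear SVC that performed best, the whitespace tokenizer is an assumption (Thai text has no spaces between words and would need a word segmenter), and the function names, report texts, and the two labels are hypothetical stand-ins for the 11 real report types.

```python
from collections import Counter, defaultdict
import math


def ngram_features(text):
    """Return the unigrams followed by the trigrams of one problem report."""
    # Assumption: whitespace tokenization; Thai text needs a word segmenter.
    tokens = text.lower().split()
    trigrams = [" ".join(tokens[i:i + 3]) for i in range(len(tokens) - 2)]
    return tokens + trigrams


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # reports per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngram_features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        # N-grams never seen in training are ignored.
        feats = [f for f in ngram_features(text) if f in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(self.class_counts[label] / total_docs)
            score += sum(math.log((counts[f] + 1) / denom) for f in feats)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training reports and labels, for illustration only.
train_texts = [
    "cannot login to trading screen",
    "login password rejected after reset",
    "order price shows wrong value",
    "wrong price displayed on order screen",
]
train_labels = ["login", "login", "price", "price"]

model = MultinomialNB().fit(train_texts, train_labels)
prediction = model.predict("user cannot login after password reset")
print(prediction)
```

In practice the same unigram and trigram features could be fed to any of the classifiers named in the abstract; the Naïve Bayes model is used here only because it fits in a few self-contained lines.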
Index Terms
- Identification of Software Problem Report Types Using Multiclass Classification