Abstract
In customer service management, classifying service dialogues into business labels helps managers improve service quality. However, labeled service dialogue datasets in real scenarios are usually small because labeling is expensive, which makes it difficult to fully train supervised classification models. Moreover, service dialogues usually contain chitchat, which acts as noise that degrades classification performance. Existing text classification methods fail to address these two issues simultaneously. Hence, in this paper, we propose a dialogue classification algorithm that strengthens the influence of the business-related utterances in a dialogue and uses them as key utterances to improve classification. First, we propose key utterance labels that indicate which utterances in a dialogue are key utterances. Then, we propose a dialogue classification model based on the key utterance labels and logistic regression, namely KU-LR. KU-LR learns the key utterance patterns and increases the importance of key utterances in the dialogue, which allows it to make more accurate dialogue classification decisions. Experimental results on a real-world dataset show that KU-LR outperforms the baseline methods when the training dataset is small.
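The abstract does not give the KU-LR formulation, so the following is only an illustrative sketch of the general idea of emphasizing key utterances in a logistic regression classifier, not the authors' model. It assumes that key utterance labels are already available as boolean flags and that the emphasis is a fixed multiplier on bag-of-words counts. The names `featurize`, `train`, `predict_proba`, and the parameter `key_weight` are hypothetical, and the toy dialogues are invented for illustration.

```python
import math
from collections import Counter

def featurize(dialogue, key_weight=3.0):
    """Turn a dialogue, given as (utterance, is_key) pairs, into weighted word counts.

    Assumption: tokens from key utterances count key_weight times as much as chitchat.
    """
    features = Counter()
    for text, is_key in dialogue:
        weight = key_weight if is_key else 1.0
        for token in text.lower().split():
            features[token] += weight
    return features

def predict_proba(weights, bias, features):
    """Logistic regression probability of the positive class."""
    z = bias + sum(weights.get(tok, 0.0) * val for tok, val in features.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(dialogues, labels, key_weight=3.0, lr=0.1, epochs=200):
    """Fit binary logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    data = [featurize(d, key_weight) for d in dialogues]
    for _ in range(epochs):
        for feats, y in zip(data, labels):
            error = predict_proba(weights, bias, feats) - y
            bias -= lr * error
            for tok, val in feats.items():
                weights[tok] = weights.get(tok, 0.0) - lr * error * val
    return weights, bias

# Toy data (invented): label 1 = billing, label 0 = network.
train_dialogues = [
    [("hello how are you", False), ("i want a refund for my bill", True)],
    [("good morning", False), ("my bill is wrong please refund", True)],
    [("hello there", False), ("the internet connection is down", True)],
    [("how are you today", False), ("my internet router is broken", True)],
]
train_labels = [1, 1, 0, 0]

weights, bias = train(train_dialogues, train_labels)

billing_query = [("hello good morning", False), ("refund my bill", True)]
network_query = [("hello how are you", False), ("internet is down", True)]
p_billing = predict_proba(weights, bias, featurize(billing_query))
p_network = predict_proba(weights, bias, featurize(network_query))
print(f"billing query: {p_billing:.3f}, network query: {p_network:.3f}")
```

In this sketch the chitchat tokens still contribute, but with a smaller weight than the business-related tokens, which is one simple way to reduce their influence when few labeled dialogues are available. A model that learns which utterances are key, as the abstract describes, would replace the hand-supplied flags.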
Acknowledgements
This research was partially sponsored by the following funds: National Key R&D Program of China (2018YFB1402800), the Fundamental Research Funds for the Provincial Universities of Zhejiang (RF-A2020007) and Zhejiang Lab (2020AA3AB05).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Liu, Y., Cao, B., Ma, K. et al. Improving the classification of call center service dialogue with key utterences. Wireless Netw 27, 3395–3406 (2021). https://doi.org/10.1007/s11276-021-02573-7
DOI: https://doi.org/10.1007/s11276-021-02573-7