Supervised ensemble sentiment-based framework to measure chatbot quality of services

Almansor, Ebtesam Hussain; Hussain, Farookh Khadeer; Hussain, Omar Khadeer

doi:10.1007/s00607-020-00863-0

Special Issue Article
Published: 04 November 2020

Volume 103, pages 491–507 (2021)

Computing

Ebtesam Hussain Almansor¹,², Farookh Khadeer Hussain¹ & Omar Khadeer Hussain³

Abstract

Intelligent chatbot development has become a trending topic in computer science over the last few years. However, a chatbot often fails to understand the user's intent, which can lead to inappropriate responses that cause dialogue breakdown and user dissatisfaction. Detecting dialogue breakdown is essential to improve chatbot performance and increase user satisfaction. Recent work has modeled conversation breakdown using several approaches, both supervised and unsupervised. Unsupervised approaches rely heavily on datasets, which makes them challenging to apply to the breakdown task. Another challenge in predicting conversation breakdown is the bias of human annotation in the dataset, along with the process for handling the breakdown. To tackle this challenge, we have developed a supervised ensemble automated approach that measures Chatbot Quality of Service (CQoS) based on dialogue breakdown. The proposed approach labels the datasets based on sentiment, considering the context of the conversation, to predict the breakdown. In this paper we aim to detect the effect of the sentiment change of each speaker in a conversation. Furthermore, we use the supervised ensemble model to measure the CQoS based on breakdown. We then handle the breakdown with a hand-over mechanism that transfers the user to a live agent. Based on this idea, we perform several experiments across several datasets and state-of-the-art models, and we find that using sentiment as a trigger for breakdown outperforms human annotation. Overall, we infer that knowledge acquired from the supervised ensemble model can indeed help to measure CQoS based on detecting the breakdown in conversation.
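
The abstract describes sentiment-driven labeling and a hand-over step only at a high level, so the following is a minimal illustrative sketch rather than the authors' method. It assumes a toy word-list sentiment scorer in place of a trained model and omits the supervised ensemble entirely. The names `sentiment_score`, `label_breakdowns`, `should_hand_over`, the lexicons, and both thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy lexicons; a real system would use a trained sentiment model.
POSITIVE = {"thanks", "great", "good", "perfect", "helpful"}
NEGATIVE = {"wrong", "useless", "bad", "confusing", "frustrated"}


def sentiment_score(utterance: str) -> int:
    """Toy score: number of positive words minus number of negative words."""
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


@dataclass
class Turn:
    speaker: str  # "user" or "bot"
    text: str


def label_breakdowns(turns, drop_threshold=1):
    """Label each bot turn as a breakdown when the user's sentiment after the
    turn is at least `drop_threshold` lower than before it (assumed rule)."""
    labels = []
    for i, turn in enumerate(turns):
        if turn.speaker != "bot":
            continue
        # Nearest user turns on either side give the conversational context.
        prev_user = next((t for t in reversed(turns[:i]) if t.speaker == "user"), None)
        next_user = next((t for t in turns[i + 1:] if t.speaker == "user"), None)
        if prev_user is None or next_user is None:
            labels.append((i, False))  # not enough context to judge
            continue
        drop = sentiment_score(prev_user.text) - sentiment_score(next_user.text)
        labels.append((i, drop >= drop_threshold))
    return labels


def should_hand_over(labels, max_breakdowns=1):
    """Trigger a transfer to a live agent once enough breakdowns are flagged."""
    return sum(flag for _, flag in labels) >= max_breakdowns


turns = [
    Turn("user", "Hi, I need help with my order."),
    Turn("bot", "Sure, what is your order number?"),
    Turn("user", "It is 1234, thanks."),
    Turn("bot", "The weather today is sunny."),
    Turn("user", "That is wrong and useless."),
]
labels = label_breakdowns(turns)
print(labels)                    # [(1, False), (3, True)]
print(should_hand_over(labels))  # True
```

In this sketch the second bot turn is flagged because the user's score falls from 1 to -2 across it. In a setup like the one the abstract outlines, such sentiment-derived labels would stand in for human annotation when training a breakdown classifier.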

Author information

Authors and Affiliations

School of Computer Science, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, Australia
Ebtesam Hussain Almansor & Farookh Khadeer Hussain
Community College, Najran University, Najran, Saudi Arabia
Ebtesam Hussain Almansor
School of Business, University of New South Wales (UNSW), Canberra, Australia
Omar Khadeer Hussain

Corresponding author

Correspondence to Ebtesam Hussain Almansor.

About this article

Cite this article

Almansor, E.H., Hussain, F.K. & Hussain, O.K. Supervised ensemble sentiment-based framework to measure chatbot quality of services. Computing 103, 491–507 (2021). https://doi.org/10.1007/s00607-020-00863-0

Received: 08 September 2020
Accepted: 20 October 2020
Published: 04 November 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s00607-020-00863-0

Keywords

Mathematics Subject Classification

60-08
