Introducing Multi-modality in Persuasive Task Oriented Virtual Sales Agent

  • Conference paper

Neural Information Processing (ICONIP 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13625)

Abstract

In recent years, virtual assistants that complete tasks such as service scheduling and online shopping have grown in both popularity and necessity. The primary objective of a task-oriented conversational agent is to serve an end user's task goals effectively and successfully. Beyond that, user satisfaction is one of the most important aspects to address. Communicating with multi-modal responses makes a conversation easier and more engaging, and responding with appropriate images can improve the quality of a task-oriented conversation in terms of user satisfaction. Keeping these aspects in mind, we propose a framework that infuses multi-modality into an end-to-end persuasive task-oriented dialogue generation module. Additionally, we create a personalised persuasive multi-modal dialogue (PPMD) corpus, annotated at the turn level with slots, sentiment, and agent actions, that contains multi-modal responses from both ends. The results and thorough analysis on this dataset show that the proposed multi-modal persuasive virtual assistant outperforms traditional task-oriented frameworks in terms of user satisfaction.
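To make the turn-level annotation scheme described in the abstract concrete, the sketch below shows what a single annotated turn in a PPMD-style corpus might look like. This is a minimal illustration in Python; the record fields (speaker, utterance, intent, slots, sentiment, agent_action, image) are hypothetical names chosen here for clarity, not the paper's actual schema.

    # A minimal, hypothetical sketch of one turn-level record in a
    # PPMD-style corpus; field names are illustrative assumptions,
    # not the paper's actual annotation schema.
    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class AnnotatedTurn:
        speaker: str                        # "user" or "agent"
        utterance: str                      # surface text of the turn
        intent: str                         # intent label for the turn
        slots: dict = field(default_factory=dict)  # slot-value pairs
        sentiment: str = "neutral"          # turn-level sentiment label
        agent_action: Optional[str] = None  # dialogue act for agent turns
        image: Optional[str] = None         # attached image path, if any

    # Example agent turn pairing a persuasive utterance with an image,
    # reflecting the multi-modal responses the corpus is said to contain.
    turn = AnnotatedTurn(
        speaker="agent",
        utterance="This phone's 108 MP camera suits your photography hobby.",
        intent="persuade",
        slots={"camera": "108 MP"},
        agent_action="inform",
        image="images/phone_123.jpg",
    )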



Author information

Correspondence to Aritra Raut.

Appendix

Table 5. Intent, slot and dialogue act list of the PPMD dataset
Table 6. Examples of different persuasion strategies
Fig. 6. An example of a generated conversation from multi-USBAR

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Raut, A. et al. (2023). Introducing Multi-modality in Persuasive Task Oriented Virtual Sales Agent. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Lecture Notes in Computer Science, vol 13625. Springer, Cham. https://doi.org/10.1007/978-3-031-30111-7_46

  • DOI: https://doi.org/10.1007/978-3-031-30111-7_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30110-0

  • Online ISBN: 978-3-031-30111-7

  • eBook Packages: Computer Science, Computer Science (R0)
