ABSTRACT
External knowledge-enhanced task-oriented dialogue systems aim to cover user requests that fall outside pre-defined DBs/APIs. Recent dialogue systems have focused on retrieving external knowledge sources relevant to the dialogue context and have achieved competitive results. However, because they do not model entity-aware dialogue intention, these systems struggle to link out-of-API functions accurately and efficiently in real-world scenarios. To tackle this problem, this paper investigates learning dense entity-aware dialogue intentions for retrieving external knowledge documents in task-oriented dialogues. We propose an intention-guided two-stage training approach consisting of an intention-guided training stage and a knowledge transfer stage. The approach leverages rewritten utterances that explicitly convey entity-aware user intentions, and it can improve the performance of existing Bi-Encoder retrievers such as DPR (Dense Passage Retrieval). In the intention-guided training stage, a posterior history encoder is initialized and guided by rewritten utterances given as input, so that it learns discriminative dense representations. In the knowledge transfer stage, these representations are transferred, via an extra intent consistency loss, to a newly initialized prior encoder that is used for inference. In addition, negatives are sampled from the test knowledge documents to learn more discriminative dense representations for the unseen domain. Our approach requires neither response annotations nor an extra response generator, and it scales well. Experimental results on the augmented MultiWOZ 2.1 dataset show that our approach outperforms all baseline models except relevance classifiers in retrieval accuracy, while remaining reasonably efficient.
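To make the knowledge transfer stage more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the form of either loss, so the in-batch contrastive retrieval loss, the squared-distance form of the consistency term, and the names `retrieval_loss`, `consistency_loss`, `transfer_loss`, and `weight` are all assumptions made for illustration.

```python
import math

def dot(u, v):
    """Dot-product similarity, as used by Bi-Encoder retrievers such as DPR."""
    return sum(a * b for a, b in zip(u, v))

def retrieval_loss(query_vecs, doc_vecs):
    """In-batch contrastive loss (assumed form): query i's positive is document i,
    and every other document in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(query_vecs):
        scores = [dot(q, d) for d in doc_vecs]
        m = max(scores)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[i]  # negative log-probability of the positive
    return total / len(query_vecs)

def consistency_loss(prior_vecs, posterior_vecs):
    """Assumed consistency term: mean squared distance between the prior encoder's
    output (dialogue history only) and the posterior encoder's output
    (history plus rewritten utterance)."""
    total = 0.0
    for p, q in zip(prior_vecs, posterior_vecs):
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return total / len(prior_vecs)

def transfer_loss(prior_vecs, posterior_vecs, doc_vecs, weight=1.0):
    """Hypothetical combined objective for the knowledge transfer stage."""
    return (retrieval_loss(prior_vecs, doc_vecs)
            + weight * consistency_loss(prior_vecs, posterior_vecs))

# Toy vectors standing in for encoder outputs (hypothetical values).
docs = [[10.0, 0.0], [0.0, 10.0]]
posterior = [[10.0, 0.0], [0.0, 10.0]]  # already aligned with its documents
prior = [[0.0, 0.0], [0.0, 0.0]]        # freshly initialized, uninformative

print(retrieval_loss(posterior, docs))
print(transfer_loss(prior, posterior, docs, weight=0.5))
```

In this toy setup the posterior vectors already separate the two documents, so their retrieval loss is close to zero, while the uninformative prior vectors are penalized both for poor retrieval and for their distance from the posterior representations.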
Index Terms
- Learning Dense Entity-Aware Dialogue Intentions with Rewritten Utterance for External Knowledge Documents Retrieval