DOI: 10.1145/3488560.3498404

Unsupervised Cross-Domain Adaptation for Response Selection Using Self-Supervised and Adversarial Training

Published: 15 February 2022

Abstract

Recently, many neural context-response matching models have been developed for retrieval-based dialogue systems. Although existing models achieve impressive performance by learning from large amounts of in-domain parallel dialogue data, they usually perform considerably worse when applied to a new domain. How to transfer a response retrieval model trained in high-resource domains to low-resource domains is therefore a crucial problem for building scalable dialogue systems. To this end, we investigate unsupervised cross-domain adaptation for response selection when the target domain has no parallel dialogue data. Specifically, we propose a two-stage method that adapts a response selection model to a new domain using self-supervised and adversarial training on top of pre-trained language models (PLMs). To efficiently incorporate domain awareness and target-domain knowledge into PLMs, we first design a self-supervised post-training procedure consisting of a domain discrimination (DD) task, a target-domain masked language model (MLM) task, and a target-domain next sentence prediction (NSP) task. Building on this, we further conduct adversarial fine-tuning so that the model matches the proper response using domain-shared features as much as possible. Experimental results show that our proposed method achieves consistent and significant improvements on several cross-domain response selection datasets.
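
The abstract describes the method only at a high level. The sketch below (not the authors' released code) shows one way the two training stages could be wired together in PyTorch, assuming a BERT-style encoder, a [CLS]-style pooled representation, DANN-style gradient reversal for the adversarial stage, and equal loss weights; all module and function names are illustrative assumptions.

```python
# Illustrative sketch only: Stage 1 self-supervised post-training (DD + target-domain
# MLM + target-domain NSP) and Stage 2 adversarial fine-tuning that encourages the
# matching head to rely on domain-shared features via gradient reversal.
import torch
import torch.nn as nn
from torch.autograd import Function


class GradReverse(Function):
    """Identity on the forward pass; flips (and scales) gradients on the backward
    pass, as in DANN-style adversarial training."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.lambd, None


class AdaptiveResponseSelector(nn.Module):
    """A PLM-style encoder with three heads: context-response matching (reused as
    target-domain NSP during post-training), domain discrimination, and MLM."""

    def __init__(self, encoder, hidden_size, vocab_size):
        super().__init__()
        self.encoder = encoder                        # any module mapping (B, T, H) -> (B, T, H)
        self.match_head = nn.Linear(hidden_size, 2)   # matching / NSP head
        self.domain_head = nn.Linear(hidden_size, 2)  # source vs. target classifier
        self.mlm_head = nn.Linear(hidden_size, vocab_size)

    def forward(self, embeddings, adversarial=False, lambd=1.0):
        hidden = self.encoder(embeddings)             # contextualised token states
        pooled = hidden[:, 0]                         # [CLS]-style pooled vector
        # Stage 1 trains the domain head directly (domain awareness); Stage 2 routes
        # it through gradient reversal so the encoder keeps only domain-shared features.
        dom_in = GradReverse.apply(pooled, lambd) if adversarial else pooled
        return {
            "match": self.match_head(pooled),
            "domain": self.domain_head(dom_in),
            "mlm": self.mlm_head(hidden),
        }


def post_training_loss(out, mlm_labels, nsp_labels, domain_labels):
    """Stage 1: target-domain MLM + target-domain NSP + domain discrimination
    (equal weights assumed purely for illustration)."""
    ce = nn.CrossEntropyLoss(ignore_index=-100)       # -100 masks non-MLM positions
    mlm = ce(out["mlm"].flatten(0, 1), mlm_labels.flatten())
    nsp = ce(out["match"], nsp_labels)
    dd = ce(out["domain"], domain_labels)
    return mlm + nsp + dd


def adversarial_finetuning_loss(out, match_labels, domain_labels):
    """Stage 2: supervised matching loss on labelled source data plus a domain loss
    whose reversed gradients push the encoder toward domain-shared features."""
    ce = nn.CrossEntropyLoss()
    return ce(out["match"], match_labels) + ce(out["domain"], domain_labels)
```

In this reading, Stage 1 would be run on unlabelled target-domain (and source-domain) dialogues to build domain awareness and target-domain knowledge, while Stage 2 would mix labelled source batches (for the matching loss) with unlabelled target batches (for the reversed domain loss).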


    Published In

    WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
    February 2022
    1690 pages
    ISBN:9781450391320
    DOI:10.1145/3488560
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep neural network
    2. matching
    3. multi-turn response selection
    4. retrieval-based chatbot
    5. unsupervised cross-domain adaptation

    Qualifiers

    • Research-article

    Conference

    WSDM '22

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%
