DOI: 10.1145/3488560.3498404

Unsupervised Cross-Domain Adaptation for Response Selection Using Self-Supervised and Adversarial Training

Published: 15 February 2022

Abstract

Recently, many neural context-response matching models have been developed for retrieval-based dialogue systems. Although existing models achieve impressive performance by learning from large amounts of in-domain parallel dialogue data, they usually perform considerably worse when applied to a new domain. How to transfer a response retrieval model trained in high-resource domains to low-resource domains is therefore a crucial problem for building scalable dialogue systems. To this end, we investigate unsupervised cross-domain adaptation for response selection when the target domain has no parallel dialogue data. Specifically, we propose a two-stage method that adapts a response selection model to a new domain using self-supervised and adversarial training on top of pre-trained language models (PLMs). To efficiently incorporate domain awareness and target-domain knowledge into PLMs, we first design a self-supervised post-training procedure consisting of a domain discrimination (DD) task, a target-domain masked language model (MLM) task, and a target-domain next sentence prediction (NSP) task. Building on this, we further conduct adversarial fine-tuning so that the model matches the proper response using domain-shared features as much as possible. Experimental results show that our proposed method achieves consistent and significant improvements on several cross-domain response selection datasets.
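
The abstract describes the method only at a high level. The sketch below (not the authors' released code) shows one way the two training stages could be wired together in PyTorch, assuming a BERT-style encoder, a [CLS]-style pooled representation, DANN-style gradient reversal for the adversarial stage, and equal loss weights; all module and function names are illustrative assumptions.

```python
# Illustrative sketch only: Stage 1 self-supervised post-training (DD + target-domain
# MLM + target-domain NSP) and Stage 2 adversarial fine-tuning that encourages the
# matching head to rely on domain-shared features via gradient reversal.
import torch
import torch.nn as nn
from torch.autograd import Function


class GradReverse(Function):
    """Identity on the forward pass; flips (and scales) gradients on the backward
    pass, as in DANN-style adversarial training."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.lambd, None


class AdaptiveResponseSelector(nn.Module):
    """A PLM-style encoder with three heads: context-response matching (reused as
    target-domain NSP during post-training), domain discrimination, and MLM."""

    def __init__(self, encoder, hidden_size, vocab_size):
        super().__init__()
        self.encoder = encoder                        # any module mapping (B, T, H) -> (B, T, H)
        self.match_head = nn.Linear(hidden_size, 2)   # matching / NSP head
        self.domain_head = nn.Linear(hidden_size, 2)  # source vs. target classifier
        self.mlm_head = nn.Linear(hidden_size, vocab_size)

    def forward(self, embeddings, adversarial=False, lambd=1.0):
        hidden = self.encoder(embeddings)             # contextualised token states
        pooled = hidden[:, 0]                         # [CLS]-style pooled vector
        # Stage 1 trains the domain head directly (domain awareness); Stage 2 routes
        # it through gradient reversal so the encoder keeps only domain-shared features.
        dom_in = GradReverse.apply(pooled, lambd) if adversarial else pooled
        return {
            "match": self.match_head(pooled),
            "domain": self.domain_head(dom_in),
            "mlm": self.mlm_head(hidden),
        }


def post_training_loss(out, mlm_labels, nsp_labels, domain_labels):
    """Stage 1: target-domain MLM + target-domain NSP + domain discrimination
    (equal weights assumed purely for illustration)."""
    ce = nn.CrossEntropyLoss(ignore_index=-100)       # -100 masks non-MLM positions
    mlm = ce(out["mlm"].flatten(0, 1), mlm_labels.flatten())
    nsp = ce(out["match"], nsp_labels)
    dd = ce(out["domain"], domain_labels)
    return mlm + nsp + dd


def adversarial_finetuning_loss(out, match_labels, domain_labels):
    """Stage 2: supervised matching loss on labelled source data plus a domain loss
    whose reversed gradients push the encoder toward domain-shared features."""
    ce = nn.CrossEntropyLoss()
    return ce(out["match"], match_labels) + ce(out["domain"], domain_labels)
```

In this reading, Stage 1 would be run on unlabelled target-domain (and source-domain) dialogues to build domain awareness and target-domain knowledge, while Stage 2 would mix labelled source batches (for the matching loss) with unlabelled target batches (for the reversed domain loss).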


    Published In

    WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
    February 2022
    1690 pages
    ISBN:9781450391320
    DOI:10.1145/3488560
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep neural network
    2. matching
    3. multi-turn response selection
    4. retrieval-based chatbot
    5. unsupervised cross-domain adaptation

    Qualifiers

    • Research-article

    Conference

    WSDM '22

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%
