
Unsupervised Domain Adaptation on Sentence Matching Through Self-Supervision

  • Regular Paper
  • Published in: Journal of Computer Science and Technology

Abstract

Although neural approaches have yielded state-of-the-art results on the sentence matching task, their performance inevitably drops dramatically when applied to unseen domains. To tackle this cross-domain challenge, we address unsupervised domain adaptation for sentence matching, where the goal is to achieve good performance on a target domain given only unlabeled target-domain data together with labeled source-domain data. Specifically, we propose to perform self-supervised tasks to achieve this. Unlike previous unsupervised domain adaptation methods, self-supervision can not only be flexibly tailored to the characteristics of sentence matching through a special design, but is also much easier to optimize. During training, each self-supervised task is performed on both domains simultaneously in an easy-to-hard curriculum, which gradually brings the two domains closer together along the direction relevant to the task. As a result, the classifier trained on the source domain is able to generalize to the unlabeled target domain. In total, we present three types of self-supervised tasks, and the results demonstrate their superiority. In addition, we further study the performance of different usages of the self-supervised tasks, which should offer guidance on how to effectively utilize self-supervision in cross-domain scenarios.
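
The abstract describes the training scheme only in words; below is a minimal, hypothetical PyTorch sketch of the general idea: a shared encoder trained with a supervised matching loss on the labeled source domain plus a self-supervised loss computed on both domains, with an easy-to-hard curriculum over the self-supervised examples. The toy encoder, the pair-level self-supervised head, and the loss-based curriculum schedule are illustrative assumptions, not the paper's actual tasks or difficulty measure.

```python
# Hypothetical sketch (not the paper's implementation): joint training of a
# source-supervised matching head and a self-supervision head shared across
# the source and target domains, with an easy-to-hard curriculum.
import torch
import torch.nn as nn


class MatchingModel(nn.Module):
    """Shared sentence-pair encoder with a matching head (source-supervised)
    and a self-supervision head (trained on both domains)."""

    def __init__(self, hidden=256, vocab=30000):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, hidden)   # stand-in sentence encoder
        self.match_head = nn.Linear(2 * hidden, 2)    # matched / not matched
        self.ssl_head = nn.Linear(2 * hidden, 2)      # e.g., pair-order prediction

    def encode(self, a, b):
        return torch.cat([self.embed(a), self.embed(b)], dim=-1)

    def forward(self, a, b):
        h = self.encode(a, b)
        return self.match_head(h), self.ssl_head(h)


def train_step(model, opt, src_batch, tgt_batch, epoch, num_epochs, lam=1.0):
    """One step: supervised matching loss on source data plus a self-supervised
    loss on both domains; the curriculum keeps only the easiest (lowest-loss)
    fraction of self-supervised examples early in training."""
    ce = nn.CrossEntropyLoss(reduction="none")
    sa, sb, sy, s_ssl_y = src_batch   # source sentence pairs, match labels, SSL labels
    ta, tb, t_ssl_y = tgt_batch       # target sentence pairs, SSL labels only

    match_logits, s_ssl_logits = model(sa, sb)
    _, t_ssl_logits = model(ta, tb)

    sup_loss = ce(match_logits, sy).mean()

    # Self-supervised losses pooled over both domains.
    ssl_losses = torch.cat([ce(s_ssl_logits, s_ssl_y), ce(t_ssl_logits, t_ssl_y)])

    # Easy-to-hard schedule: grow the kept fraction from ~1/num_epochs to 1.
    keep = max(1, int(len(ssl_losses) * min(1.0, (epoch + 1) / num_epochs)))
    ssl_loss = torch.sort(ssl_losses).values[:keep].mean()

    loss = sup_loss + lam * ssl_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In the paper, difficulty is defined per self-supervised task rather than by the running loss used above; the sketch only illustrates how the curriculum gradually exposes the model to harder examples from both domains.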


Author information


Corresponding author

Correspondence to Shi-Zhu He.

Supplementary Information

ESM 1 (PDF 298 kb)


About this article


Cite this article

Bai, GR., Liu, QB., He, SZ. et al. Unsupervised Domain Adaptation on Sentence Matching Through Self-Supervision. J. Comput. Sci. Technol. 38, 1237–1249 (2023). https://doi.org/10.1007/s11390-022-1479-0


