
Unsupervised Domain Adaptation on Sentence Matching Through Self-Supervision

  • Regular Paper
  • Published in: Journal of Computer Science and Technology

Abstract

Although neural approaches have yielded state-of-the-art results on the sentence matching task, their performance inevitably drops dramatically when applied to unseen domains. To tackle this cross-domain challenge, we address unsupervised domain adaptation for sentence matching, where the goal is to achieve good performance on a target domain given only unlabeled target-domain data together with labeled source-domain data. Specifically, we propose to perform self-supervised tasks to achieve this. Unlike previous unsupervised domain adaptation methods, self-supervision can not only be flexibly tailored to the characteristics of sentence matching through a special design, but is also much easier to optimize. During training, each self-supervised task is performed on both domains simultaneously in an easy-to-hard curriculum, which gradually brings the two domains closer together along the direction relevant to the task. As a result, the classifier trained on the source domain is able to generalize to the unlabeled target domain. In total, we present three types of self-supervised tasks, and the results demonstrate their superiority. In addition, we further study the performance of different usages of the self-supervised tasks, which should offer guidance on how to effectively utilize self-supervision in cross-domain scenarios.
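
The abstract describes the training scheme only in words; below is a minimal, hypothetical PyTorch sketch of the general idea: a shared encoder trained with a supervised matching loss on the labeled source domain plus a self-supervised loss computed on both domains, with an easy-to-hard curriculum over the self-supervised examples. The toy encoder, the pair-level self-supervised head, and the loss-based curriculum schedule are illustrative assumptions, not the paper's actual tasks or difficulty measure.

```python
# Hypothetical sketch (not the paper's implementation): joint training of a
# source-supervised matching head and a self-supervision head shared across
# the source and target domains, with an easy-to-hard curriculum.
import torch
import torch.nn as nn


class MatchingModel(nn.Module):
    """Shared sentence-pair encoder with a matching head (source-supervised)
    and a self-supervision head (trained on both domains)."""

    def __init__(self, hidden=256, vocab=30000):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, hidden)   # stand-in sentence encoder
        self.match_head = nn.Linear(2 * hidden, 2)    # matched / not matched
        self.ssl_head = nn.Linear(2 * hidden, 2)      # e.g., pair-order prediction

    def encode(self, a, b):
        return torch.cat([self.embed(a), self.embed(b)], dim=-1)

    def forward(self, a, b):
        h = self.encode(a, b)
        return self.match_head(h), self.ssl_head(h)


def train_step(model, opt, src_batch, tgt_batch, epoch, num_epochs, lam=1.0):
    """One step: supervised matching loss on source data plus a self-supervised
    loss on both domains; the curriculum keeps only the easiest (lowest-loss)
    fraction of self-supervised examples early in training."""
    ce = nn.CrossEntropyLoss(reduction="none")
    sa, sb, sy, s_ssl_y = src_batch   # source sentence pairs, match labels, SSL labels
    ta, tb, t_ssl_y = tgt_batch       # target sentence pairs, SSL labels only

    match_logits, s_ssl_logits = model(sa, sb)
    _, t_ssl_logits = model(ta, tb)

    sup_loss = ce(match_logits, sy).mean()

    # Self-supervised losses pooled over both domains.
    ssl_losses = torch.cat([ce(s_ssl_logits, s_ssl_y), ce(t_ssl_logits, t_ssl_y)])

    # Easy-to-hard schedule: grow the kept fraction from ~1/num_epochs to 1.
    keep = max(1, int(len(ssl_losses) * min(1.0, (epoch + 1) / num_epochs)))
    ssl_loss = torch.sort(ssl_losses).values[:keep].mean()

    loss = sup_loss + lam * ssl_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In the paper, difficulty is defined per self-supervised task rather than by the running loss used above; the sketch only illustrates how the curriculum gradually exposes the model to harder examples from both domains.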


Author information


Corresponding author

Correspondence to Shi-Zhu He.

Supplementary Information

ESM 1 (PDF 298 kb)


About this article


Cite this article

Bai, GR., Liu, QB., He, SZ. et al. Unsupervised Domain Adaptation on Sentence Matching Through Self-Supervision. J. Comput. Sci. Technol. 38, 1237–1249 (2023). https://doi.org/10.1007/s11390-022-1479-0


