Abstract
Measuring the semantic similarity between two texts is a fundamental aspect of text semantic matching. Each word in a text carries a different weight of meaning, and it is essential for a model to capture the most important knowledge effectively. However, current BERT-based text matching methods have limitations in acquiring professional domain knowledge: BERT requires extensive domain-specific training data to perform well in specialized fields such as medicine, where labeled data is difficult to obtain. In addition, text matching models that inject domain knowledge often rely on creating new training tasks to fine-tune the model, which is time-consuming. Although existing works have injected domain knowledge directly into BERT through similarity matrices, they struggle with the small sample sizes typical of professional fields. Contrastive learning trains a representation model by constructing similar and dissimilar instances, so that more general representations can be learned from a small number of samples. In this paper, we propose to integrate the word similarity matrix directly into BERT’s multi-head attention mechanism under a contrastive learning framework, aligning similar words during training. Furthermore, for Chinese medical applications, we propose an entity MASK approach that enhances the pre-trained model’s understanding of medical terms. The proposed method helps BERT acquire domain knowledge and learn better text representations in professional fields. Extensive experimental results show that the algorithm significantly improves the performance of text matching models, especially when training data is limited.
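To make the core idea concrete, the following is a minimal PyTorch-style sketch (not the authors' released code) of how an external word-similarity matrix could be injected into BERT-style multi-head attention. The function name, the additive injection point, and the `alpha` weight are illustrative assumptions; the paper's exact formulation may differ.

```python
# Sketch: biasing scaled dot-product attention with an external
# word-similarity prior (illustrative assumptions, not the paper's exact method).
import math
import torch
import torch.nn.functional as F


def similarity_biased_attention(q, k, v, sim_matrix, alpha=1.0, mask=None):
    """Scaled dot-product attention with an additive similarity prior.

    q, k, v    : (batch, heads, seq_len, head_dim) query/key/value tensors
    sim_matrix : (batch, seq_len, seq_len) external word-similarity scores,
                 e.g. derived from a domain lexicon or knowledge base
    alpha      : assumed scalar hyperparameter weighting the prior
    mask       : optional (batch, 1, 1, seq_len) attention mask
    """
    d_k = q.size(-1)
    # Standard attention logits.
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d_k)
    # Inject the external similarity prior, broadcast across heads.
    scores = scores + alpha * sim_matrix.unsqueeze(1)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    attn = F.softmax(scores, dim=-1)
    return torch.matmul(attn, v)


# Toy usage: 2 sentences, 4 heads, 8 tokens, 16-dim heads.
q = k = v = torch.randn(2, 4, 8, 16)
sim = torch.rand(2, 8, 8)   # stand-in for a real word-similarity matrix
out = similarity_biased_attention(q, k, v, sim, alpha=0.5)
print(out.shape)            # torch.Size([2, 4, 8, 16])
```

In this sketch, token pairs judged similar by the external matrix receive a boost to their attention logits, which is one straightforward way to let a domain lexicon align related words during training.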



Data Availability
The data generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 62176221, 62272398), the Sichuan Science and Technology Program (Nos. 2023YFG0354, MZGC20230073, 2024YFHZ0024), the Key Research and Development Program of Sichuan Province (No. 2022NSFSC0502), and the 2023 Southwest Jiaotong University International Student Education Management Research Project (No. 23LXSGL01).
Author information
Authors and Affiliations
Contributions
Conception and design of study: Jie Hu; Acquisition of data and data curation: Jie Hu and Yinglian Zhu; Analysis and/or interpretation of data: Jie Hu and Lishan Wu; Drafting the manuscript: Jie Hu and Lishan Wu; Critical revision: Jie Hu, Yinglian Zhu, Qilei Luo, Fei Teng and Tianrui Li.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
Informed consent was obtained from all individual participants included in the study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hu, J., Zhu, Y., Wu, L. et al. Text semantic matching algorithm based on the introduction of external knowledge under contrastive learning. Int. J. Mach. Learn. & Cyber. 16, 741–753 (2025). https://doi.org/10.1007/s13042-024-02285-2