Abstract
Short text matching is a fundamental technique in natural language processing. It plays an important role in information retrieval, question answering, paraphrase identification and related tasks. However, because little information remains available after word segmentation of Chinese short texts, the existing text information must be used as fully as possible. In this paper, we propose a sentence matching model with multiway semantic interaction based on multi-granularity semantic embedding (MSIM) to address the problem of Chinese short text matching. First, each sentence pair is represented with multi-granularity embeddings: a character embedding based on one-hot vectors and a word embedding obtained from a pre-trained model. In addition, we apply an attention mechanism after the character embedding to weight the characters. To capture sufficient semantic features, we process short sentence pairs in three ways: we match each time step of the two encoded sentences and apply both average pooling and maximum pooling to the results, and we also perform deep interaction between each time step representation and an attention representation. Finally, we employ a BiLSTM to aggregate the matching results into a fixed-length matching vector, and the decision is made through a fully connected layer. Our method is evaluated on the Chinese datasets CCKS and ATEC. Experimental results demonstrate that the proposed method takes full advantage of Chinese short text information and outperforms the compared methods.
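The three matching views named in the abstract (maximum pooling, average pooling, and interaction with an attention representation) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the matching function, so cosine similarity and softmax attention are assumptions here, and the names `cosine_matrix`, `softmax` and `match_features` are hypothetical. The embedding layers, the BiLSTM aggregation and the fully connected layer are omitted.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero for all-zero vectors


def cosine_matrix(a, b):
    """Pairwise cosine similarity between rows of a (m, d) and rows of b (n, d)."""
    a_unit = a / (np.linalg.norm(a, axis=1, keepdims=True) + EPS)
    b_unit = b / (np.linalg.norm(b, axis=1, keepdims=True) + EPS)
    return a_unit @ b_unit.T


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def match_features(p, q):
    """For each time step of p, compute three match scores against q.

    Column 0: maximum-pooled similarity over the time steps of q.
    Column 1: average-pooled similarity over the time steps of q.
    Column 2: similarity to an attention-weighted summary of q
              (assumed form of the attention interaction).
    """
    sim = cosine_matrix(p, q)        # (m, n) step-by-step similarities
    max_pool = sim.max(axis=1)       # (m,)
    avg_pool = sim.mean(axis=1)      # (m,)

    attn = softmax(sim, axis=1)      # (m, n) attention weights over q
    q_att = attn @ q                 # (m, d) attentive vector per p step
    num = (p * q_att).sum(axis=1)
    den = np.linalg.norm(p, axis=1) * np.linalg.norm(q_att, axis=1) + EPS
    att_match = num / den            # (m,)

    return np.stack([max_pool, avg_pool, att_match], axis=1)  # (m, 3)


# Toy encoded sentences: 5 and 7 time steps, hidden size 8 (random stand-ins).
rng = np.random.default_rng(0)
p = rng.normal(size=(5, 8))
q = rng.normal(size=(7, 8))
features = match_features(p, q)
print(features.shape)
```

In a full model, a per-step feature matrix like `features` would be computed in both directions (p against q and q against p) and passed to the aggregation layer; this sketch only shows how the pooled and attentive views differ.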
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Project 61673079, by the cooperation projects between universities in Chongqing and institutes affiliated with the Chinese Academy of Sciences (HZ2021018), and by the Innovation Research Group of Universities in Chongqing (CXQT20016).
About this article
Cite this article
Tang, X., Luo, Y., Xiong, D. et al. Short text matching model with multiway semantic interaction based on multi-granularity semantic embedding. Appl Intell 52, 15632–15642 (2022). https://doi.org/10.1007/s10489-022-03410-w
DOI: https://doi.org/10.1007/s10489-022-03410-w