Abstract
The field of Natural Language Processing (NLP) has flourished in computer science over the past decades, largely due to the exponential growth of internet applications such as search engines, social network platforms, chatbots and the Internet of Things (IoT). At the same time, the robotics and human-computer interaction fields have been closely connected to NLP development through the exploration of human-robot and human-computer communication in natural language. In this work, we address the problem of semantic similarity between text passages, which arises in many NLP applications, such as human-computer/robot communication through natural language text. More specifically, we develop three deep learning models for the task: two variations of the Siamese BiLSTM model and a variation of the simple BiLSTM model. We use two different word embedding techniques: (a) classic token-to-vector embeddings using GloVe, and (b) embeddings produced by the encoder part of the BERT model. Finally, we train and compare the models in terms of performance through experimental studies on two datasets, MRPC (MSRP) and Quora, and we draw conclusions about the advantages and disadvantages of each one. The Siamese BERT-BiLSTM model achieves an accuracy of 83.03% on the Quora dataset, which is comparable to the state of the art.
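To make the architecture concrete, the following is a minimal sketch (not the authors' implementation) of a Siamese BiLSTM for sentence-pair similarity with GloVe-style embeddings: all class names, layer sizes, pooling, and feature-combination choices are illustrative assumptions, and the toy embedding matrix merely stands in for pretrained GloVe vectors.

```python
# Minimal sketch of a Siamese BiLSTM paraphrase classifier (illustrative only;
# not the paper's code). Assumes sentences are already tokenized and mapped to
# integer ids, and that `embedding_matrix` holds pretrained GloVe vectors.
import torch
import torch.nn as nn


class SiameseBiLSTM(nn.Module):
    def __init__(self, embedding_matrix: torch.Tensor, hidden_size: int = 128):
        super().__init__()
        # Frozen GloVe-style embedding layer (token id -> vector).
        self.embedding = nn.Embedding.from_pretrained(embedding_matrix, freeze=True)
        self.encoder = nn.LSTM(
            input_size=embedding_matrix.size(1),
            hidden_size=hidden_size,
            batch_first=True,
            bidirectional=True,
        )
        # Classifier over the two sentence encodings and their interactions.
        self.classifier = nn.Sequential(
            nn.Linear(4 * 2 * hidden_size, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Encode one sentence; mean-pool the BiLSTM outputs over time.
        outputs, _ = self.encoder(self.embedding(token_ids))
        return outputs.mean(dim=1)

    def forward(self, ids_a: torch.Tensor, ids_b: torch.Tensor) -> torch.Tensor:
        # Both branches share the same weights (the "Siamese" part).
        u, v = self.encode(ids_a), self.encode(ids_b)
        features = torch.cat([u, v, torch.abs(u - v), u * v], dim=-1)
        return self.classifier(features).squeeze(-1)  # logit: paraphrase or not


if __name__ == "__main__":
    # Toy embedding matrix standing in for GloVe (vocab of 1000, 50-dim vectors).
    glove = torch.randn(1000, 50)
    model = SiameseBiLSTM(glove)
    a = torch.randint(0, 1000, (4, 12))  # batch of 4 sentences, 12 tokens each
    b = torch.randint(0, 1000, (4, 12))
    print(model(a, b).shape)  # torch.Size([4])
```

Swapping the frozen GloVe lookup for a BERT encoder producing contextual token vectors would give the BERT-BiLSTM variant described above; the Siamese weight-sharing and pair classifier stay the same.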
Copyright information
© 2023 IFIP International Federation for Information Processing
Cite this paper
Fradelos, G., Perikos, I., Hatzilygeroudis, I. (2023). Using Siamese BiLSTM Models for Identifying Text Semantic Similarity. In: Maglogiannis, I., Iliadis, L., Papaleonidas, A., Chochliouros, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 677. Springer, Cham. https://doi.org/10.1007/978-3-031-34171-7_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34170-0
Online ISBN: 978-3-031-34171-7
eBook Packages: Computer Science, Computer Science (R0)