Abstract
The field of Natural Language Processing (NLP) has flourished in computer science over the past decades, largely due to the exponential growth of internet applications such as search engines, social network platforms, chatbots and the Internet of Things (IoT). At the same time, the robotics and human-computer interaction fields have been closely connected to NLP development through the exploration of human-robot and human-computer communication in natural language. In this work, we address the problem of semantic similarity between text passages, which arises in many NLP applications, such as human-computer/robot communication through natural language text. More specifically, we develop three deep learning models for the task: two variations of the Siamese BiLSTM model and a variation of the simple BiLSTM model. We use two different word embedding techniques: (a) classic token-to-vector embeddings using GloVe, and (b) embeddings produced by the encoder part of the BERT model. Finally, we train and compare the models in terms of performance through experimental studies on two datasets, MRPC (MSRP) and Quora, and we draw conclusions about the advantages and disadvantages of each one. The Siamese BERT-BiLSTM model achieves an accuracy of 83.03% on the Quora dataset, which is comparable to the state of the art.
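To make the architecture concrete, the following is a minimal sketch (not the authors' implementation) of a Siamese BiLSTM for sentence-pair similarity with GloVe-style embeddings: all class names, layer sizes, pooling, and feature-combination choices are illustrative assumptions, and the toy embedding matrix merely stands in for pretrained GloVe vectors.

```python
# Minimal sketch of a Siamese BiLSTM paraphrase classifier (illustrative only;
# not the paper's code). Assumes sentences are already tokenized and mapped to
# integer ids, and that `embedding_matrix` holds pretrained GloVe vectors.
import torch
import torch.nn as nn


class SiameseBiLSTM(nn.Module):
    def __init__(self, embedding_matrix: torch.Tensor, hidden_size: int = 128):
        super().__init__()
        # Frozen GloVe-style embedding layer (token id -> vector).
        self.embedding = nn.Embedding.from_pretrained(embedding_matrix, freeze=True)
        self.encoder = nn.LSTM(
            input_size=embedding_matrix.size(1),
            hidden_size=hidden_size,
            batch_first=True,
            bidirectional=True,
        )
        # Classifier over the two sentence encodings and their interactions.
        self.classifier = nn.Sequential(
            nn.Linear(4 * 2 * hidden_size, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Encode one sentence; mean-pool the BiLSTM outputs over time.
        outputs, _ = self.encoder(self.embedding(token_ids))
        return outputs.mean(dim=1)

    def forward(self, ids_a: torch.Tensor, ids_b: torch.Tensor) -> torch.Tensor:
        # Both branches share the same weights (the "Siamese" part).
        u, v = self.encode(ids_a), self.encode(ids_b)
        features = torch.cat([u, v, torch.abs(u - v), u * v], dim=-1)
        return self.classifier(features).squeeze(-1)  # logit: paraphrase or not


if __name__ == "__main__":
    # Toy embedding matrix standing in for GloVe (vocab of 1000, 50-dim vectors).
    glove = torch.randn(1000, 50)
    model = SiameseBiLSTM(glove)
    a = torch.randint(0, 1000, (4, 12))  # batch of 4 sentences, 12 tokens each
    b = torch.randint(0, 1000, (4, 12))
    print(model(a, b).shape)  # torch.Size([4])
```

Swapping the frozen GloVe lookup for a BERT encoder producing contextual token vectors would give the BERT-BiLSTM variant described above; the Siamese weight-sharing and pair classifier stay the same.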
Copyright information
© 2023 IFIP International Federation for Information Processing
Cite this paper
Fradelos, G., Perikos, I., Hatzilygeroudis, I. (2023). Using Siamese BiLSTM Models for Identifying Text Semantic Similarity. In: Maglogiannis, I., Iliadis, L., Papaleonidas, A., Chochliouros, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 677. Springer, Cham. https://doi.org/10.1007/978-3-031-34171-7_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34170-0
Online ISBN: 978-3-031-34171-7
eBook Packages: Computer Science, Computer Science (R0)