Using Siamese BiLSTM Models for Identifying Text Semantic Similarity

  • Conference paper
Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops (AIAI 2023)

Abstract

The field of Natural Language Processing (NLP) has flourished in computer science over the past decades, largely due to the exponential growth of internet applications such as search engines, social network platforms, chatbots and the Internet of Things (IoT). The robotics and human-computer interaction fields, in turn, have been closely connected to NLP development through the exploration of human-robot and human-computer communication in natural language. In this work, we address the problem of semantic similarity between text passages, which arises in many NLP applications, such as human-computer/robot communication through natural language text. More specifically, we developed three deep learning models for the problem: two variations of the Siamese BiLSTM model and a variation of the simple BiLSTM model. We used two different word embedding techniques: (a) classic token-to-vector embedding using GloVe, and (b) one implementing the encoder part of the BERT model. Finally, we trained and compared the models in terms of performance through experimental studies on two datasets, MRPC (MSRP) and Quora, and we draw conclusions about the advantages and disadvantages of each one of them. The Siamese BERT-BiLSTM model achieves an accuracy of 83.03% on the Quora dataset, which is comparable to the state of the art.
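The general idea of a Siamese BiLSTM — two inputs encoded by a single, weight-shared BiLSTM tower whose outputs are then compared — can be sketched as below. This is a minimal illustration, not the authors' exact architecture: the layer sizes, the mean-pooling step, and the [u, v, |u−v|, u·v] matching features are common choices assumed here for concreteness; in the paper's GloVe variant the embedding layer would be initialised from pretrained GloVe vectors, and in the BERT variant a BERT encoder would replace it.

```python
import torch
import torch.nn as nn

class SiameseBiLSTM(nn.Module):
    """Minimal Siamese BiLSTM for sentence-pair similarity.

    Both sentences pass through the SAME embedding + BiLSTM tower
    (shared weights); the two sentence vectors are then compared
    by a small feed-forward classifier.
    """
    def __init__(self, vocab_size=10000, embed_dim=100, hidden_dim=64):
        super().__init__()
        # Hypothetical sizes; a GloVe variant would load pretrained
        # vectors into this embedding table instead.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Classifier over [u, v, |u - v|, u * v]; each vector is
        # 2*hidden_dim wide (bidirectional), hence 8*hidden_dim in total.
        self.classifier = nn.Sequential(
            nn.Linear(8 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def encode(self, tokens):
        # tokens: (batch, seq_len) integer token ids
        emb = self.embedding(tokens)
        out, _ = self.encoder(emb)
        return out.mean(dim=1)  # mean-pool over time -> (batch, 2*hidden_dim)

    def forward(self, a, b):
        u, v = self.encode(a), self.encode(b)
        feats = torch.cat([u, v, (u - v).abs(), u * v], dim=1)
        return torch.sigmoid(self.classifier(feats)).squeeze(1)

model = SiameseBiLSTM()
s1 = torch.randint(0, 10000, (2, 12))  # two "sentences" of 12 token ids
s2 = torch.randint(0, 10000, (2, 12))
score = model(s1, s2)                  # similarity probability per pair, in (0, 1)
print(score.shape)                     # torch.Size([2])
```

The key property is weight sharing: because both inputs go through the same `encode`, semantically similar sentences are mapped into the same vector space before comparison.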



Author information

Corresponding author

Correspondence to Ioannis Hatzilygeroudis.


Copyright information

© 2023 IFIP International Federation for Information Processing

About this paper


Cite this paper

Fradelos, G., Perikos, I., Hatzilygeroudis, I. (2023). Using Siamese BiLSTM Models for Identifying Text Semantic Similarity. In: Maglogiannis, I., Iliadis, L., Papaleonidas, A., Chochliouros, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 677. Springer, Cham. https://doi.org/10.1007/978-3-031-34171-7_31

  • DOI: https://doi.org/10.1007/978-3-031-34171-7_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34170-0

  • Online ISBN: 978-3-031-34171-7

  • eBook Packages: Computer Science, Computer Science (R0)
