
SiMaLSTM-SNP: novel semantic relatedness learning model preserving both Siamese networks and membrane computing

Published in The Journal of Supercomputing.

Abstract

Semantic relatedness (SR) is one of the most significant problems in natural language processing and has been identified as a critical technology for intelligent assistants such as Siri, Microsoft XiaoIce, Cortana, and Xiaoai; SemEval 2014 listed SR as its first task. While many existing studies have focused on analyzing the entailment of single phrases, advances in deep learning have made it possible to analyze complete sentences or texts. The natural parallelism of membrane computing has shown promise for data processing, but harnessing this potential to advance semantic relatedness remains an open problem. This paper proposes a novel Siamese Manhattan LSTM-SNP approach (SiMaLSTM-SNP) for the SR problem. The approach uses a collaborative Word2vec and 10-layer attention strategy to represent sentence pairs and extract their features, and a Siamese LSTM-SNP structure to compute the hidden states of the sentences. A multi-head self-attention layer identifies associations within the text and redistributes the hidden-state weights. The last hidden state is extracted, and the relatedness score is computed from the Manhattan distance. Experiments show that SiMaLSTM-SNP outperforms 17 classical SR baselines and 7 recent approaches on the standard SICK and STS datasets in terms of mean squared error. This indicates that SiMaLSTM-SNP can accurately capture the semantic distinction between two sentences while effectively preserving their semantic information.
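For intuition, the scoring step pairs a shared ("Siamese") encoder with a Manhattan-distance similarity: if h_a and h_b are the final hidden states of the two sentences, the score g(h_a, h_b) = exp(-||h_a - h_b||_1), which lies in (0, 1], is returned, as in earlier Siamese recurrent architectures. The following minimal PyTorch sketch illustrates this pattern with a plain LSTM encoder; the class name, dimensions, and encoder choice are illustrative assumptions, not the paper's implementation, which uses an LSTM-SNP cell plus attention layers (see the repository linked in the Notes).

```python
# Minimal sketch of a Siamese Manhattan-distance scorer (hypothetical,
# illustrative only). The published model replaces nn.LSTM with an
# LSTM-SNP cell and adds attention layers; see the authors' repository.
import torch
import torch.nn as nn

class SiameseManhattanLSTM(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 300, hidden_dim: int = 50):
        super().__init__()
        # Pretrained Word2vec vectors would be loaded into this embedding.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def encode(self, tokens: torch.Tensor) -> torch.Tensor:
        # Both sentences pass through the SAME encoder (shared weights).
        _, (h_n, _) = self.encoder(self.embed(tokens))
        return h_n[-1]  # last hidden state, shape (batch, hidden_dim)

    def forward(self, sent_a: torch.Tensor, sent_b: torch.Tensor) -> torch.Tensor:
        h_a, h_b = self.encode(sent_a), self.encode(sent_b)
        # Manhattan (L1) distance mapped to a similarity in (0, 1].
        return torch.exp(-torch.sum(torch.abs(h_a - h_b), dim=1))

# Usage: scores near 1 indicate semantically close sentence pairs.
model = SiameseManhattanLSTM(vocab_size=10_000)
a = torch.randint(0, 10_000, (2, 12))  # two sentences, 12 token ids each
b = torch.randint(0, 10_000, (2, 12))
print(model(a, b))                     # tensor of shape (2,), values in (0, 1]
```

Because exp(-d) maps a distance d ≥ 0 onto (0, 1], identical hidden states score exactly 1 and increasingly distant states decay toward 0, which makes the output directly usable as a relatedness score.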


Availability of supporting data

Data will be shared on request.

Notes

  1. The code is available at: https://github.com/gooSAMA/SiMaLSTM-SNP.

  2. https://www.kaggle.com/competitions/quora-question-pairs.


Acknowledgements

Not Applicable.

Funding

This work is supported by the Science and Technology Program of Sichuan Province (Grant No. 2023YFS0424), the "Open bidding for selecting the best candidates" Science and Technology Project of Chengdu (Grant No. 2023-JB00-00020-GX), and the National Natural Science Foundation of China (Grant Nos. 61902324, 11426179, and 61872298).

Author information

Authors and Affiliations

Authors

Contributions

XG took part in conceptualization; XG and XLC took part in data curation; XG took part in formal analysis; XLC and YJD took part in funding acquisition; XG took part in investigation; XLC took part in methodology; XLC took part in project administration; XL took part in resources; XG took part in software; XLC took part in supervision; PL and XYL took part in validation; XG took part in visualization; XG took part in writing – original draft; XLC and PL took part in writing – review & editing.

Corresponding author

Correspondence to Xiaoliang Chen.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethics approval

Not Applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gu, X., Chen, X., Lu, P. et al. SiMaLSTM-SNP: novel semantic relatedness learning model preserving both Siamese networks and membrane computing. J Supercomput 80, 3382–3411 (2024). https://doi.org/10.1007/s11227-023-05592-7

