
SiMaLSTM-SNP: novel semantic relatedness learning model preserving both Siamese networks and membrane computing

Published in The Journal of Supercomputing.

Abstract

Semantic relatedness (SR) is one of the most significant problems in natural language processing and has been identified as a critical technology for intelligent assistants such as Siri, Microsoft XiaoIce, Cortana, and Xiaoai; SemEval 2014 listed SR as its first task. While many existing studies have focused on analyzing the entailment of single phrases, advances in deep learning have made it possible to analyze complete sentences or texts. The natural parallelism of membrane computing has shown promise for data processing, but harnessing this potential to advance semantic relatedness remains an open problem. This paper proposes a novel Siamese Manhattan LSTM-SNP approach (SiMaLSTM-SNP) for the SR problem. The approach uses a collaborative Word2vec and 10-layer attention strategy to represent sentence pairs and extract their features, and a Siamese LSTM-SNP structure to compute the hidden states of the sentences. A multi-head self-attention layer identifies associations within the text and redistributes the hidden-state weights. The last hidden state is extracted, and the relatedness score is computed from the Manhattan distance. Experiments show that SiMaLSTM-SNP outperforms 17 classical SR baselines and 7 recent approaches on the standard SICK and STS datasets in terms of mean squared error. This indicates that SiMaLSTM-SNP can accurately capture the semantic distinction between two sentences while effectively preserving their semantic information.
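For intuition, the scoring step pairs a shared ("Siamese") encoder with a Manhattan-distance similarity: if h_a and h_b are the final hidden states of the two sentences, the score g(h_a, h_b) = exp(-||h_a - h_b||_1), which lies in (0, 1], is returned, as in earlier Siamese recurrent architectures. The following minimal PyTorch sketch illustrates this pattern with a plain LSTM encoder; the class name, dimensions, and encoder choice are illustrative assumptions, not the paper's implementation, which uses an LSTM-SNP cell plus attention layers (see the repository linked in the Notes).

```python
# Minimal sketch of a Siamese Manhattan-distance scorer (hypothetical,
# illustrative only). The published model replaces nn.LSTM with an
# LSTM-SNP cell and adds attention layers; see the authors' repository.
import torch
import torch.nn as nn

class SiameseManhattanLSTM(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 300, hidden_dim: int = 50):
        super().__init__()
        # Pretrained Word2vec vectors would be loaded into this embedding.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def encode(self, tokens: torch.Tensor) -> torch.Tensor:
        # Both sentences pass through the SAME encoder (shared weights).
        _, (h_n, _) = self.encoder(self.embed(tokens))
        return h_n[-1]  # last hidden state, shape (batch, hidden_dim)

    def forward(self, sent_a: torch.Tensor, sent_b: torch.Tensor) -> torch.Tensor:
        h_a, h_b = self.encode(sent_a), self.encode(sent_b)
        # Manhattan (L1) distance mapped to a similarity in (0, 1].
        return torch.exp(-torch.sum(torch.abs(h_a - h_b), dim=1))

# Usage: scores near 1 indicate semantically close sentence pairs.
model = SiameseManhattanLSTM(vocab_size=10_000)
a = torch.randint(0, 10_000, (2, 12))  # two sentences, 12 token ids each
b = torch.randint(0, 10_000, (2, 12))
print(model(a, b))                     # tensor of shape (2,), values in (0, 1]
```

Because exp(-d) maps a distance d ≥ 0 onto (0, 1], identical hidden states score exactly 1 and increasingly distant states decay toward 0, which makes the output directly usable as a relatedness score.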


Availability of supporting data

Data will be shared on request.

Notes

  1. The code is available at: https://github.com/gooSAMA/SiMaLSTM-SNP.

  2. https://www.kaggle.com/competitions/quora-question-pairs.


Acknowledgements

Not Applicable.

Funding

This work is supported by the Science and Technology Program of Sichuan Province (Grant No. 2023YFS0424), the "Open bidding for selecting the best candidates" Science and Technology Project of Chengdu (Grant No. 2023-JB00-00020-GX), and the National Natural Science Foundation of China (Grant Nos. 61902324, 11426179, and 61872298).

Author information

Authors and Affiliations

Authors

Contributions

XG took part in conceptualization; XG and XLC took part in data curation; XG took part in formal analysis; XLC and YJD took part in funding acquisition; XG took part in investigation; XLC took part in methodology; XLC took part in project administration; XL took part in resources; XG took part in software; XLC took part in supervision; PL and XYL took part in validation; XG took part in visualization; XG took part in writing – original draft; XLC and PL took part in writing – review & editing.

Corresponding author

Correspondence to Xiaoliang Chen.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethics approval

Not Applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gu, X., Chen, X., Lu, P. et al. SiMaLSTM-SNP: novel semantic relatedness learning model preserving both Siamese networks and membrane computing. J Supercomput 80, 3382–3411 (2024). https://doi.org/10.1007/s11227-023-05592-7

