Abstract
Natural language inference (NLI) aims to identify the logical relationship between a premise and a corresponding hypothesis, which requires a model to effectively capture their semantic relationship. Most existing transformer-based models concatenate the premise and hypothesis as a single input and capture their relationship through the multi-head self-attention mechanism, which may consider only their plain context-sensitive relationship and neglect the potential mutual influence of their contextual semantics. To better model the relationship between the premise and hypothesis, we propose a new transformer-based model, RAN4NLI, which consists of a sequence encoder based on a pre-trained language model for encoding the input semantics and an interaction network based on residual attention for further capturing their relationship. We utilize residual attention to combine multi-head self-attention and cross-attention information, strengthening the potential semantic relationship between the premise and hypothesis. Experiments conducted on two canonical datasets, SNLI and SciTail, demonstrate that RAN4NLI achieves performance comparable to that of strong baseline models.
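Since the abstract only sketches the architecture, the following is a minimal PyTorch sketch of one possible residual-attention interaction block combining self-attention and premise-hypothesis cross-attention on top of pre-trained encoder outputs. The class name ResidualAttentionInteraction, the hidden size of 768, the use of nn.MultiheadAttention, and the particular residual summation and normalization are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of a residual-attention interaction block for NLI.
# Assumes PyTorch >= 1.9 (for batch_first MultiheadAttention); all layer
# sizes and the combination scheme are illustrative, not the paper's spec.
import torch
import torch.nn as nn


class ResidualAttentionInteraction(nn.Module):
    """Combine self-attention and premise-hypothesis cross-attention
    through a residual connection (one plausible reading of RAN4NLI)."""

    def __init__(self, hidden_size: int = 768, num_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, premise: torch.Tensor, hypothesis: torch.Tensor) -> torch.Tensor:
        # Self-attention over the premise captures its internal context.
        self_out, _ = self.self_attn(premise, premise, premise)
        # Cross-attention lets the premise attend to the hypothesis tokens.
        cross_out, _ = self.cross_attn(premise, hypothesis, hypothesis)
        # Residually combine both attention views with the original encoding.
        return self.norm(premise + self_out + cross_out)


if __name__ == "__main__":
    # Toy usage with random tensors standing in for pre-trained LM outputs.
    premise_enc = torch.randn(2, 16, 768)     # (batch, premise_len, hidden)
    hypothesis_enc = torch.randn(2, 12, 768)  # (batch, hypothesis_len, hidden)
    block = ResidualAttentionInteraction()
    fused = block(premise_enc, hypothesis_enc)
    print(fused.shape)  # torch.Size([2, 16, 768])
```

In this sketch the residual sum lets the cross-attention signal augment, rather than replace, the context already encoded by self-attention; the actual gating or weighting used in the paper may differ.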
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yu, S., Su, J., Ye, X., Ma, D. (2024). Improving Natural Language Inference with Residual Attention. In: Huang, DS., Premaratne, P., Yuan, C. (eds) Applied Intelligence. ICAI 2023. Communications in Computer and Information Science, vol 2015. Springer, Singapore. https://doi.org/10.1007/978-981-97-0827-7_29
DOI: https://doi.org/10.1007/978-981-97-0827-7_29
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0826-0
Online ISBN: 978-981-97-0827-7