Abstract
Neural networks have attracted considerable attention for natural language inference in recent years. Interactions between the premise and the hypothesis have proven effective in improving sentence representations. Existing methods have mainly focused on a single interaction, while multiple interactions have not been well studied. In this paper, we propose a Dependent Multilevel Interaction (DMI) network that models multiple interactions between the premise and the hypothesis to boost the performance of natural language inference. Specifically, a single-interaction unit (SIU) structure with a novel combining attention mechanism is presented to capture the features of one interaction. We then cascade a series of SIUs in a multilevel interaction layer to obtain more comprehensive features. Experiments on two benchmark datasets, SciTail and SNLI, show the effectiveness of the proposed model. Our model outperforms state-of-the-art approaches on the SciTail dataset without using any external resources; on SNLI, it also achieves competitive results.
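The abstract names the components of the architecture but not their internals, which are behind the paywall. Purely as an illustrative sketch of the general shape described (cross-sentence interaction units cascaded so each level re-interacts the previous level's outputs), the following PyTorch code shows one way such a stack could be wired up. The attention and fusion details here (soft cross-attention plus a gated combination), the number of levels, and the class names SingleInteractionUnit and MultilevelInteraction are our assumptions, not the paper's published formulation.

import torch
import torch.nn as nn

class SingleInteractionUnit(nn.Module):
    """Hypothetical stand-in for the paper's SIU: one interaction step
    between premise and hypothesis encodings. The paper's 'combining
    attention' is approximated here by soft cross-attention followed by
    a gated fusion of the attended and original states."""

    def __init__(self, dim: int):
        super().__init__()
        self.fuse = nn.Linear(4 * dim, dim)
        self.gate = nn.Linear(4 * dim, dim)

    def _attend(self, a, b):
        # Soft alignment: each token in `a` attends over all tokens in `b`.
        scores = torch.bmm(a, b.transpose(1, 2))         # (B, La, Lb)
        return torch.bmm(scores.softmax(dim=-1), b)      # (B, La, D)

    def _combine(self, x, attended):
        # Fuse original and attended views with a learned gate (residual mix).
        feats = torch.cat([x, attended, x - attended, x * attended], dim=-1)
        g = torch.sigmoid(self.gate(feats))
        return g * torch.tanh(self.fuse(feats)) + (1 - g) * x

    def forward(self, premise, hypothesis):
        p_att = self._attend(premise, hypothesis)
        h_att = self._attend(hypothesis, premise)
        return self._combine(premise, p_att), self._combine(hypothesis, h_att)

class MultilevelInteraction(nn.Module):
    """Cascade of SIUs: each level re-interacts the outputs of the
    previous one, so later levels see increasingly mixed features."""

    def __init__(self, dim: int, levels: int = 3):
        super().__init__()
        self.units = nn.ModuleList(SingleInteractionUnit(dim) for _ in range(levels))

    def forward(self, premise, hypothesis):
        for unit in self.units:
            premise, hypothesis = unit(premise, hypothesis)
        return premise, hypothesis

With levels=1 this reduces to a single interaction; stacking units lets each level re-align representations already mixed by the previous one, which is the intuition behind the multilevel design the abstract describes.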
Acknowledgement
This research was funded by Xiaoi Research, the Science and Technology Commission of Shanghai Municipality (No. 18511105502), and the Key Teaching Reform Project for Undergraduates in Shanghai Universities.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Y., et al. (2019). Dependent Multilevel Interaction Network for Natural Language Inference. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series. Lecture Notes in Computer Science, vol. 11730. Springer, Cham. https://doi.org/10.1007/978-3-030-30490-4_2
DOI: https://doi.org/10.1007/978-3-030-30490-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30489-8
Online ISBN: 978-3-030-30490-4