Research Article
DOI: 10.1145/3132218.3132236

Siamese Network with Soft Attention for Semantic Text Understanding

Published: 11 September 2017

ABSTRACT

We propose a task-independent neural network model based on a Siamese-twin architecture. Our model benefits from two forms of attention, which we use to extract high-level feature representations of the underlying texts at both the word level (intra-attention) and the sentence level (inter-attention). The inter-attention scheme uses one of the texts to create a contextual interlock with the other, thus attending to their mutually important parts. We evaluate our system on three tasks: Textual Entailment, Paraphrase Detection, and Answer-Sentence Selection. We achieve a near state-of-the-art result on the textual entailment task with the SNLI corpus while obtaining strong performance on the other tasks.
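To make the inter-attention scheme concrete, below is a minimal sketch of soft inter-attention between two texts encoded by a shared (Siamese) encoder. It is written in PyTorch; all names, dimensions, the dot-product scoring function, and the mean pooling are illustrative assumptions rather than the authors' implementation, and the intra-attention component is omitted for brevity.

```python
# Minimal sketch of a Siamese encoder with soft inter-attention.
# Illustrative only: layer sizes, scoring, and pooling are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseSoftAttention(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # A single encoder applied to both texts: the Siamese constraint.
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)

    def forward(self, text_a, text_b):
        # Encode both texts with shared weights: (batch, len, 2*hidden).
        h_a, _ = self.encoder(self.embed(text_a))
        h_b, _ = self.encoder(self.embed(text_b))
        # Inter-attention: score every word pair across the two texts,
        # creating the "contextual interlock" between them.
        scores = torch.bmm(h_a, h_b.transpose(1, 2))   # (batch, len_a, len_b)
        attn_a = F.softmax(scores, dim=2)              # A attends over B
        attn_b = F.softmax(scores, dim=1)              # B attends over A
        a_aligned = torch.bmm(attn_a, h_b)                  # B-aware view of A
        b_aligned = torch.bmm(attn_b.transpose(1, 2), h_a)  # A-aware view of B
        # Pool to fixed-size sentence vectors for a downstream classifier.
        v_a = a_aligned.mean(dim=1)
        v_b = b_aligned.mean(dim=1)
        return torch.cat([v_a, v_b, torch.abs(v_a - v_b)], dim=1)
```

The key point the sketch illustrates is that the attention weights are computed from both texts jointly, so each sentence representation is conditioned on the other, while the shared encoder weights are what make the architecture Siamese.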


Published in
SEMANTiCS 2017: Proceedings of the 13th International Conference on Semantic Systems
September 2017, 202 pages
ISBN: 9781450352963
DOI: 10.1145/3132218

Copyright © 2017 ACM

Publisher
Association for Computing Machinery, New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

Overall acceptance rate: 40 of 182 submissions, 22%
