ABSTRACT
We propose a task-independent neural network model based on a Siamese twin architecture. The model benefits from two forms of attention, which we use to extract high-level feature representations of the underlying texts at both the word level (intra-attention) and the sentence level (inter-attention). The inter-attention scheme uses one text to create a contextual interlock with the other, thus attending to mutually important parts. We evaluate our system on three tasks: textual entailment, paraphrase detection, and answer-sentence selection. We achieve a near state-of-the-art result on the textual entailment task with the SNLI corpus while obtaining strong performance across the other tasks.
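The abstract describes two complementary attention forms but gives no implementation. Below is a minimal, hypothetical NumPy sketch of the general idea, assuming simple dot-product scoring: intra-attention pools each sentence's word states against its own mean vector, while inter-attention scores every word pair across the two texts so each text is pooled with respect to the other. The function names, scoring choices, and pooling scheme are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def intra_attention(H):
    """Word-level (intra) attention: weight each word's hidden state
    by its dot-product similarity to the sentence's mean vector."""
    context = H.mean(axis=0)            # (d,) rough sentence summary
    scores = H @ context                # (n,) one score per word
    alpha = softmax(scores)             # (n,) attention weights
    return alpha @ H                    # (d,) attended sentence vector

def inter_attention(Ha, Hb):
    """Sentence-level (inter) attention: score every word pair across
    the two texts, then pool each text weighted by the other."""
    S = Ha @ Hb.T                       # (na, nb) cross-similarity matrix
    a_w = softmax(S.max(axis=1))        # words of A most salient w.r.t. B
    b_w = softmax(S.max(axis=0))        # words of B most salient w.r.t. A
    return a_w @ Ha, b_w @ Hb           # two (d,) interlocked vectors

rng = np.random.default_rng(0)
Ha = rng.normal(size=(5, 8))  # text A: 5 words, 8-dim hidden states
Hb = rng.normal(size=(7, 8))  # text B: 7 words, 8-dim hidden states
va = intra_attention(Ha)
ra, rb = inter_attention(Ha, Hb)
print(va.shape, ra.shape, rb.shape)
```

In a Siamese setup, the same weights would produce `Ha` and `Hb` (e.g. via a shared LSTM encoder), and the pooled vectors would feed a final classifier for entailment, paraphrase, or answer selection.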