Abstract
Matching two natural language sentences is a fundamental problem in both natural language processing and information retrieval. Preliminary studies have shown that syntactic structures help improve matching accuracy, and that different syntactic structures of natural language are complementary for sentence semantic understanding. Ideally, a matching model would leverage all of this syntactic information. Existing models, however, can combine only a limited number (usually one) of syntactic structure types, owing to the complex and heterogeneous nature of syntactic information. To address this problem, we propose a novel matching model that formulates sentence matching as a representation learning task on a syntax-informed heterogeneous graph. The model, referred to as SIGN (Syntactic-Informed Graph Network), first constructs a heterogeneous matching graph from multiple syntactic structures of the two input sentences. A graph attention network is then applied to the matching graph to learn high-level node representations. Within this graph learning framework, the multiple syntactic structures, as well as the word semantics, can be represented and can interact in the matching graph, thereby collectively enhancing matching accuracy. We conducted comprehensive experiments on three public datasets. The results demonstrate that SIGN outperforms the state of the art and can also discriminate between sentences in an interpretable way.
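To make the graph-learning step concrete, the following is a minimal sketch of a single-head graph attention layer of the kind the abstract refers to (in the style of Veličković et al., 2018), applied to a toy matching graph whose nodes are the words of both sentences and whose edges encode syntactic links plus self-loops. The function names, the use of NumPy, and the dense adjacency representation are illustrative assumptions, not SIGN's actual implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(H, A, W, a, leak=0.2):
    """One single-head graph attention layer (illustrative sketch).

    H: (n, d_in) node features, e.g. word embeddings of both sentences.
    A: (n, n) adjacency of the matching graph (1 = edge; include self-loops).
    W: (d_in, d_out) shared linear projection.
    a: (2 * d_out,) attention parameter vector.
    Returns (n, d_out) updated node representations.
    """
    Z = H @ W                      # project every node's features
    n = Z.shape[0]
    out = np.zeros((n, Z.shape[1]))
    for i in range(n):
        nbrs = np.where(A[i] > 0)[0]   # neighbors of node i in the graph
        # LeakyReLU attention logits, one per neighbor
        logits = np.array([np.concatenate([Z[i], Z[j]]) @ a for j in nbrs])
        logits = np.where(logits > 0, logits, leak * logits)
        alpha = softmax(logits)        # normalized attention coefficients
        out[i] = alpha @ Z[nbrs]       # attention-weighted aggregation
    return np.maximum(out, 0.0)        # ReLU nonlinearity

# Toy usage: 4 word nodes (two 2-word sentences), fully connected for brevity.
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))
A = np.ones((4, 4))                    # a real model would use syntactic edges
W = rng.normal(size=(3, 2))
a = rng.normal(size=(4,))
reps = gat_layer(H, A, W, a)           # (4, 2) high-level node representations
```

In the full model, stacking several such layers over a graph that mixes edge types from multiple parses is what lets the different syntactic views interact through shared word nodes.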