DOI: 10.1145/3532213.3532262

Lightweight Text Matching Method with Rich Features

Published: 13 July 2022

Abstract

Text matching is one of the research hotspots in Natural Language Processing (NLP). It is of great practical importance for applications such as text de-duplication, web retrieval, and question answering systems. To address the large parameter counts and low efficiency of existing text matching models, a lightweight text matching method with rich features is proposed. The overall architecture is a Siamese neural network with shared parameters. The method uses an improved residual network together with an attention mechanism to extract and align vector representations, and retains only three key features for the alignment operations. In addition, an averaging operation is added to the fusion layer so that the prediction layer receives information-rich vector representations. Experimental results on a paraphrase identification dataset and two natural language inference datasets show that, compared with existing models, the proposed approach effectively reduces the number of parameters while maintaining good text matching performance. The experiments demonstrate that the method is applicable to general text matching tasks.
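
To make the described architecture concrete, the following is a minimal PyTorch sketch of a parameter-shared Siamese matcher with soft attention alignment, a fusion step that keeps three features per token, and an added averaging (mean-pooling) operation before prediction. The layer sizes, the exact choice of the three fused features, and the pooling scheme are illustrative assumptions, not the authors' exact configuration or their improved residual network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SiameseMatcher(nn.Module):
    """Sketch of a lightweight Siamese text matcher (assumed configuration)."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Shared encoder; a simple feed-forward stack stands in for the
        # paper's improved residual network.
        self.encoder = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # Fusion keeps three features per token (aligned, difference, product).
        self.fusion = nn.Linear(hidden_dim * 3, hidden_dim)
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim * 4, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def encode(self, tokens):
        # Both sentences pass through the same (shared) encoder.
        return self.encoder(self.embedding(tokens))  # (batch, len, hidden)

    @staticmethod
    def align(a, b):
        # Soft attention alignment: tokens of `a` attend over `b` and vice versa.
        scores = torch.matmul(a, b.transpose(1, 2))             # (batch, len_a, len_b)
        a_aligned = torch.matmul(F.softmax(scores, dim=2), b)   # b summarized per a-token
        b_aligned = torch.matmul(F.softmax(scores, dim=1).transpose(1, 2), a)
        return a_aligned, b_aligned

    def fuse(self, x, x_aligned):
        feats = torch.cat([x_aligned, x - x_aligned, x * x_aligned], dim=-1)
        fused = torch.relu(self.fusion(feats))
        # Added averaging: mean-pool alongside max-pool for a richer sentence vector.
        return torch.cat([fused.max(dim=1).values, fused.mean(dim=1)], dim=-1)

    def forward(self, tokens_a, tokens_b):
        a, b = self.encode(tokens_a), self.encode(tokens_b)
        a_aligned, b_aligned = self.align(a, b)
        va, vb = self.fuse(a, a_aligned), self.fuse(b, b_aligned)
        return self.classifier(torch.cat([va, vb], dim=-1))
```

The Siamese weight sharing and the small number of fused features are what keep the parameter count low relative to heavier interaction-based matchers; the mean-pooled term in the fusion output is the "averaging operation" that supplies the prediction layer with additional information.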



Published In

ICCAI '22: Proceedings of the 8th International Conference on Computing and Artificial Intelligence
March 2022
809 pages
ISBN:9781450396110
DOI:10.1145/3532213
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 July 2022


Author Tags

  1. attention mechanism
  2. average
  3. improved residual network
  4. lightweight
  5. text matching

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICCAI '22
