DOI: 10.1145/3397271.3401203

Attending to Inter-sentential Features in Neural Text Classification

Published: 25 July 2020

Abstract

Text classification requires a deep understanding of the linguistic features in text, in particular the intra-sentential (local) and the inter-sentential (global) features. Models that operate on word sequences have been used successfully to capture the local features, yet they are not effective at capturing the global features in long text. We investigate graph-level extensions to such models and propose a novel architecture for combining alternative text features. It uses an attention mechanism to dynamically decide how much information to use from the sequence-level or the graph-level component. We evaluated the different architectures on a range of text classification datasets, and graph-level extensions were found to improve performance on most benchmarks. In addition, the attention-based architecture, which adaptively learns the combination from the data, outperforms the generic and fixed-value concatenation alternatives.
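
The page reproduces only the abstract, but the combination it describes can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the authors' implementation: it assumes a bidirectional GRU as the sequence-level (local) encoder, a single Kipf-and-Welling-style graph convolution over a per-document word graph as the graph-level (global) encoder, and a learned softmax attention that weights the two resulting document vectors before classification. All module, parameter, and tensor names are illustrative.

    import torch
    import torch.nn as nn

    class GraphConv(nn.Module):
        # Single graph-convolution layer in the style of Kipf & Welling:
        # H' = ReLU(A_hat @ H @ W), with A_hat a normalized adjacency matrix.
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.linear = nn.Linear(in_dim, out_dim)

        def forward(self, a_hat, h):
            return torch.relu(a_hat @ self.linear(h))

    class AttentiveHybridClassifier(nn.Module):
        # Hypothetical sketch: fuses a sequence-level and a graph-level
        # document vector with learned softmax attention weights.
        def __init__(self, emb_dim, hid_dim, n_classes):
            super().__init__()
            self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
            self.gcn = GraphConv(emb_dim, 2 * hid_dim)
            self.att = nn.Linear(2 * hid_dim, 1)   # scores each component
            self.out = nn.Linear(2 * hid_dim, n_classes)

        def forward(self, word_embs, a_hat, node_feats):
            # Local (intra-sentential) view: mean-pooled BiGRU states over the word sequence.
            seq_states, _ = self.rnn(word_embs)                  # (B, T, 2*hid)
            h_seq = seq_states.mean(dim=1)                       # (B, 2*hid)
            # Global (inter-sentential) view: mean-pooled GCN states over a document graph.
            h_graph = self.gcn(a_hat, node_feats).mean(dim=0)    # (2*hid,)
            h_graph = h_graph.unsqueeze(0).expand_as(h_seq)      # (B, 2*hid)
            # Attention decides how much information to take from each component.
            components = torch.stack([h_seq, h_graph], dim=1)    # (B, 2, 2*hid)
            alpha = torch.softmax(self.att(components), dim=1)   # (B, 2, 1)
            fused = (alpha * components).sum(dim=1)              # (B, 2*hid)
            return self.out(fused)                               # (B, n_classes)

A fixed-value concatenation baseline, as referred to in the abstract, would instead join h_seq and h_graph with constant weights rather than learning alpha from the data.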

Supplementary Material

MP4 File (3397271.3401203.mp4)
This is the video presentation for the paper "Attending to Inter-sentential Features in Neural Text Classification" as submitted to SIGIR 2020. The video provides an overview of the proposed model, the experiments, and the results obtained in the study.


Cited By

View all
  • (2024)The Outcomes and Publication Standards of Research Descriptions in Document Classification: A Systematic ReviewIEEE Access10.1109/ACCESS.2024.351355012(189253-189287)Online publication date: 2024
  • (2023)Neural Text Classification by Jointly Learning to Cluster and Align2023 International Joint Conference on Neural Networks (IJCNN)10.1109/IJCNN54540.2023.10191269(1-8)Online publication date: 18-Jun-2023


Published In

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2020
2548 pages
ISBN: 9781450380164
DOI: 10.1145/3397271

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 July 2020

Author Tags

  1. attention mechanism
  2. graph network
  3. hybrid neural network

Qualifiers

  • Short-paper

Conference

SIGIR '20

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
