DOI: 10.1145/3397271.3401203

Attending to Inter-sentential Features in Neural Text Classification

Published: 25 July 2020

Abstract

Text classification requires a deep understanding of the linguistic features in text, in particular the intra-sentential (local) and the inter-sentential (global) features. Models that operate on word sequences have been used successfully to capture the local features, yet they are not effective at capturing the global features in long text. We investigate graph-level extensions to such models and propose a novel architecture for combining alternative text features. It uses an attention mechanism to dynamically decide how much information to use from the sequence-level or the graph-level component. We evaluated the different architectures on a range of text classification datasets, and graph-level extensions were found to improve performance on most benchmarks. In addition, the attention-based architecture, which adaptively learns the combination from the data, outperforms the generic and fixed-value concatenation alternatives.
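
The page reproduces only the abstract, but the combination it describes can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the authors' implementation: it assumes a bidirectional GRU as the sequence-level (local) encoder, a single Kipf-and-Welling-style graph convolution over a per-document word graph as the graph-level (global) encoder, and a learned softmax attention that weights the two resulting document vectors before classification. All module, parameter, and tensor names are illustrative.

    import torch
    import torch.nn as nn

    class GraphConv(nn.Module):
        # Single graph-convolution layer in the style of Kipf & Welling:
        # H' = ReLU(A_hat @ H @ W), with A_hat a normalized adjacency matrix.
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.linear = nn.Linear(in_dim, out_dim)

        def forward(self, a_hat, h):
            return torch.relu(a_hat @ self.linear(h))

    class AttentiveHybridClassifier(nn.Module):
        # Hypothetical sketch: fuses a sequence-level and a graph-level
        # document vector with learned softmax attention weights.
        def __init__(self, emb_dim, hid_dim, n_classes):
            super().__init__()
            self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
            self.gcn = GraphConv(emb_dim, 2 * hid_dim)
            self.att = nn.Linear(2 * hid_dim, 1)   # scores each component
            self.out = nn.Linear(2 * hid_dim, n_classes)

        def forward(self, word_embs, a_hat, node_feats):
            # Local (intra-sentential) view: mean-pooled BiGRU states over the word sequence.
            seq_states, _ = self.rnn(word_embs)                  # (B, T, 2*hid)
            h_seq = seq_states.mean(dim=1)                       # (B, 2*hid)
            # Global (inter-sentential) view: mean-pooled GCN states over a document graph.
            h_graph = self.gcn(a_hat, node_feats).mean(dim=0)    # (2*hid,)
            h_graph = h_graph.unsqueeze(0).expand_as(h_seq)      # (B, 2*hid)
            # Attention decides how much information to take from each component.
            components = torch.stack([h_seq, h_graph], dim=1)    # (B, 2, 2*hid)
            alpha = torch.softmax(self.att(components), dim=1)   # (B, 2, 1)
            fused = (alpha * components).sum(dim=1)              # (B, 2*hid)
            return self.out(fused)                               # (B, n_classes)

A fixed-value concatenation baseline, as referred to in the abstract, would instead join h_seq and h_graph with constant weights rather than learning alpha from the data.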

Supplementary Material

MP4 File (3397271.3401203.mp4)
This is the video presentation for the paper "Attending to Inter-sentential Features in Neural Text Classification" as submitted to SIGIR 2020. The video provides an overview of the proposed model, the experiments, and the results obtained in the study.


Cited By

View all
  • (2024)The Outcomes and Publication Standards of Research Descriptions in Document Classification: A Systematic ReviewIEEE Access10.1109/ACCESS.2024.351355012(189253-189287)Online publication date: 2024
  • (2023)Neural Text Classification by Jointly Learning to Cluster and Align2023 International Joint Conference on Neural Networks (IJCNN)10.1109/IJCNN54540.2023.10191269(1-8)Online publication date: 18-Jun-2023


Published In

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2020
2548 pages
ISBN: 9781450380164
DOI: 10.1145/3397271

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 July 2020

Author Tags

  1. attention mechanism
  2. graph network
  3. hybrid neural network

Qualifiers

  • Short-paper

Conference

SIGIR '20

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
