Label-Based Convolutional Neural Network for Text Classification

DOI: 10.1145/3448218.3448235

Published: 15 February 2021

Editorial Notes

The authors have requested minor, non-substantive changes to the VoR and, in accordance with ACM policies, a Corrected VoR was published on February 19, 2021. For reference purposes the VoR may still be accessed via the Supplemental Material section on this page.

Abstract

Neural network models based on word embeddings have achieved remarkable results in text classification. Even so, these models rarely consider that the importance of individual words, and of the labels themselves, is beneficial for obtaining an informative text representation. Attention mechanisms are usually employed to weight words and thereby improve predictive performance; we attempt to achieve the same goal in a simpler way. Since word embeddings can capture semantic regularities between words, in this paper we introduce a label-based text representation that embeds each label and the word vectors in the same space. In this label-based representation, each word carries one weight per class, and this weight information plays an important role in the final performance. We therefore propose a label-based convolutional neural network (LBCNN) that captures the importance of different words in the label-based text sequence and the most influential semantic features in the word vectors, respectively. Experimental results show that the proposed method outperforms state-of-the-art methods on several large text classification datasets.
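
The abstract describes the label-based representation only at a high level. Below is a minimal sketch of one plausible reading, assuming each word's per-class weight is the cosine similarity between its embedding and the corresponding label embedding in the shared space; the exact compatibility function is not specified on this page, and label_based_representation is a hypothetical helper name.

```python
import numpy as np

def label_based_representation(word_vecs: np.ndarray,
                               label_vecs: np.ndarray) -> np.ndarray:
    """Sketch of a label-based text representation (assumed form).

    word_vecs:  (seq_len, dim) word embeddings for one text.
    label_vecs: (num_classes, dim) label embeddings in the same space.
    Returns a (seq_len, num_classes) matrix: one weight per word per class,
    here computed as cosine similarity (an assumption, not the paper's
    confirmed compatibility function).
    """
    w = word_vecs / (np.linalg.norm(word_vecs, axis=1, keepdims=True) + 1e-8)
    l = label_vecs / (np.linalg.norm(label_vecs, axis=1, keepdims=True) + 1e-8)
    return w @ l.T  # cosine similarities between each word and each label

# Example: a 6-word text, 300-dim embeddings, 4 classes.
rng = np.random.default_rng(0)
words = rng.normal(size=(6, 300))
labels = rng.normal(size=(4, 300))
G = label_based_representation(words, labels)
print(G.shape)  # (6, 4): each word carries one weight per class
```

Per the abstract, the proposed LBCNN would then convolve over this per-class weight sequence, alongside the word vectors, to extract word importance and the most influential semantic features.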

Supplementary Material

3448235-vor (3448235-vor.pdf)
Version of Record for "Label-Based Convolutional Neural Network for Text Classification" by Wang et al., Proceedings of the 5th International Conference on Control Engineering and Artificial Intelligence (CCEAI '21).


Cited By

  • (2023) Fast Text Classification using Lean Gradient Descent Feed Forward Neural Network for Category Feature Augmentation. 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 2341-2348. DOI: 10.1109/TrustCom60117.2023.00330
  • (2022) Graph convolutional networks in language and vision: A survey. Knowledge-Based Systems, 251:109250. DOI: 10.1016/j.knosys.2022.109250


Published In

CCEAI '21: Proceedings of the 5th International Conference on Control Engineering and Artificial Intelligence
January 2021
165 pages
ISBN:9781450388870
DOI:10.1145/3448218
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • York University

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article
  • Research
  • Refereed limited

Conference

CCEAI 2021
