Abstract
Convolutional neural networks (CNNs) have achieved breakthroughs since deep learning was first applied to text classification. However, traditional CNNs usually use the same set of filters for feature extraction, so labels play a less central role in the final performance because label information is not fully utilized. To address the problems of how to better represent label information and how to apply the learned label representations to text classification tasks, we propose an adaptive convolution with label embedding (ACLE), which adaptively generates convolutional filters conditioned on the inputs and embeds each label in the same space as the word vectors. Our method retains the flexibility of adaptive convolution while fully extracting label information to play an auxiliary role. Experimental results on several large text datasets show that the proposed model is feasible and outperforms state-of-the-art methods by a large margin in terms of accuracy.
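The two ideas the abstract names, input-conditioned filter generation and labels embedded in the word-vector space, can be illustrated with a minimal NumPy sketch. All dimensions, the cosine compatibility score, and the single-layer filter generator below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; the paper's real sizes differ)
seq_len, emb_dim, num_labels, kernel_size = 8, 16, 4, 3

words = rng.standard_normal((seq_len, emb_dim))      # word embeddings
labels = rng.standard_normal((num_labels, emb_dim))  # label embeddings, same space as words

def cosine(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

# 1) Label-word compatibility: score each word against every label,
#    then softmax the best-label score into per-word attention weights
compat = cosine(words, labels)                # (seq_len, num_labels)
word_weights = np.exp(compat.max(axis=1))
word_weights /= word_weights.sum()

# 2) Input-conditioned filter generation: a small generator maps the
#    label-attended context vector to the weights of one conv filter
context = word_weights @ words                        # (emb_dim,)
W_gen = rng.standard_normal((kernel_size * emb_dim, emb_dim)) * 0.1
filt = (W_gen @ context).reshape(kernel_size, emb_dim)

# 3) Slide the adaptively generated filter over the sequence (1-D conv)
feature_map = np.array([
    np.sum(words[i:i + kernel_size] * filt)
    for i in range(seq_len - kernel_size + 1)
])
print(feature_map.shape)  # (6,)
```

Because the filter is a function of the (label-attended) input rather than a fixed learned tensor, each document is convolved with its own filters; in the full model the generator would be trained end-to-end with the classifier.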
Cite this article
Tan, C., Ren, Y. & Wang, C. An adaptive convolution with label embedding for text classification. Appl Intell 53, 804–812 (2023). https://doi.org/10.1007/s10489-021-02702-x