
Combining weighted category-aware contextual information in convolutional neural networks for text classification


Abstract

Convolutional neural networks (CNNs) are widely used in many natural language processing tasks, where they employ convolutional filters to capture useful semantic features of a text. However, a convolutional filter with a small window size cannot capture enough contextual information, while simply enlarging the window size brings problems of data sparsity and an enormous number of parameters. To capture contextual information, we propose using a weighted sum operation to obtain contextual word representations. We present one implicit weighting method and two explicit category-aware weighting methods to assign weights to the contextual information. Experimental results on five text classification datasets show the effectiveness of the proposed methods.
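A minimal sketch of the weighted-sum idea, assuming per-word scalar weights and a fixed context window (the function name, window size, and normalisation are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def contextual_representation(embeddings, weights, window=2):
    """Toy weighted-sum contextual representation: each word's context
    vector is a normalised weighted sum of the embeddings of the words
    in a fixed window around it. All names here are illustrative."""
    n, _ = embeddings.shape
    contextual = np.zeros_like(embeddings)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        w = weights[lo:hi]
        w = w / (w.sum() + 1e-8)               # normalise weights over the window
        contextual[i] = w @ embeddings[lo:hi]  # weighted sum of neighbour embeddings
    return contextual

# Example: 6 words with 4-dimensional embeddings and per-word weights.
emb = np.random.rand(6, 4)
w = np.random.rand(6)
print(contextual_representation(emb, w).shape)  # (6, 4)
```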


Notes

  1. http://cogcomp.cs.illinois.edu/Data/QA/QC/

  2. http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html

  3. https://www.cs.cornell.edu/people/pabo/movie-review-data/

  4. http://www.cs.cornell.edu/home/llee/data/search-subj.html

  5. http://www.cs.pitt.edu/mpqa/


Acknowledgements

This article is an extension of the conference paper: Wu, X., Cai, Y., Li, Q., Xu, J., Leung, H.F.: Combining contextual information by self-attention mechanism in convolutional neural networks for text classification. In: International Conference on Web Information Systems Engineering, pp. 453–467. Springer, Cham (2018).

In this article, we make the following contributions beyond the conference paper:

– We conduct several additional experiments to further test the performance of the methods proposed in the conference paper. We find that the weights computed by self-attention can degenerate into unexplainable values and become harmful noise to the model.

– To address the limitations of the self-attention mechanism and to further improve the interpretability and controllability of the model, we propose an explicit category-aware term weighting method that explicitly assigns weights to words and computes the contextual word embedding.

– To further leverage the co-occurrence information between words, we propose a co-occurrence-based weighting method (a toy sketch of these weighting schemes follows this list).

– We conduct several experiments on five short text classification datasets to demonstrate the effectiveness of our newly proposed methods. The results show that our methods outperform several state-of-the-art methods.
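As a rough, hypothetical illustration of the explicit weighting ideas above (not the paper's exact formulas), the sketch below computes a simple category-aware term weight from labelled documents and a PMI-style co-occurrence weight for word pairs; the function names, smoothing constants, and toy data are all assumptions:

```python
import math
from collections import Counter
from itertools import combinations

def category_aware_weight(docs, labels, category):
    """Toy category-aware term weight: terms frequent in `category` but
    rare elsewhere get larger weights (in the spirit of supervised term
    weighting schemes; not the paper's exact formula)."""
    in_cat, out_cat = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (in_cat if label == category else out_cat).update(set(doc))
    return {t: math.log(2.0 + in_cat[t] / (out_cat[t] + 1.0)) for t in in_cat}

def cooccurrence_weight(docs):
    """Toy PMI-style co-occurrence weight for word pairs that appear in
    the same document."""
    word, pair = Counter(), Counter()
    for doc in docs:
        ws = sorted(set(doc))
        word.update(ws)
        pair.update(combinations(ws, 2))
    n = len(docs)
    return {p: math.log((pair[p] / n) / ((word[p[0]] / n) * (word[p[1]] / n)))
            for p in pair}

# Tiny worked example on made-up data.
docs = [["good", "movie"], ["bad", "movie"], ["good", "plot"]]
labels = ["pos", "neg", "pos"]
print(category_aware_weight(docs, labels, "pos")["good"])   # ~1.386
print(cooccurrence_weight(docs)[("good", "movie")])          # ~-0.288
```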

This work was supported by the Fundamental Research Funds for the Central Universities, SCUT (No. 2017ZD048, D2182480), the Science and Technology Planning Project of Guangdong Province (No. 2017B050506004), and the Science and Technology Program of Guangzhou (No. 201704030076, 201802010027). The research described in this article has also been supported by a collaborative research grant from the Hong Kong Research Grants Council (Project No. C1031-18G) and a CUHK Direct Grant (Project Code EE16963).

Author information


Correspondence to Yi Cai.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2018

Guest Editors: Hakim Hacid, Wojciech Cellary, Hua Wang and Yanchun Zhang


About this article


Cite this article

Wu, X., Cai, Y., Li, Q. et al. Combining weighted category-aware contextual information in convolutional neural networks for text classification. World Wide Web 23, 2815–2834 (2020). https://doi.org/10.1007/s11280-019-00757-y
