
Character-level text classification via convolutional neural network and gated recurrent unit

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

Text categorization, or text classification, is a key task for representing the semantic information of documents. Traditional deep learning models for text categorization are generally time-consuming on large-scale datasets, owing either to slow convergence or to a heavy reliance on pre-trained word vectors. Motivated by fully convolutional networks in image processing, we introduce fully convolutional layers to substantially reduce the number of parameters in the text classification model. We propose a character-level model for short text classification that integrates a convolutional neural network, a bidirectional gated recurrent unit, and a highway network with fully connected layers, capturing both global and local textual semantics while converging quickly. In addition, an error-minimized extreme learning machine is incorporated into the proposed model to further improve classification accuracy. Extensive experiments show that our approach achieves state-of-the-art performance compared with existing methods on large-scale text datasets.
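The pipeline the abstract describes (character embeddings, a convolutional layer for local n-gram features, a bidirectional GRU for global sequence semantics, and a highway layer before the classifier) can be sketched as follows. This is a minimal illustration in PyTorch, not the authors' implementation: all layer sizes, the single conv/GRU layer counts, and the omission of the error-minimized extreme learning machine head are assumptions made for brevity.

```python
import torch
import torch.nn as nn

class CharCNNBiGRU(nn.Module):
    """Hypothetical sketch: char embedding -> Conv1d -> BiGRU -> highway -> FC."""

    def __init__(self, vocab_size=70, embed_dim=16, conv_channels=64,
                 hidden=32, num_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Convolution over the character sequence captures local features.
        self.conv = nn.Conv1d(embed_dim, conv_channels, kernel_size=3, padding=1)
        self.pool = nn.MaxPool1d(2)
        # Bidirectional GRU captures global, order-dependent semantics.
        self.gru = nn.GRU(conv_channels, hidden, bidirectional=True,
                          batch_first=True)
        # Highway layer: gated mix of a transformed and a carried representation.
        self.transform = nn.Linear(2 * hidden, 2 * hidden)
        self.gate = nn.Linear(2 * hidden, 2 * hidden)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                        # x: (batch, seq_len) char ids
        e = self.embed(x).transpose(1, 2)        # (batch, embed_dim, seq_len)
        c = self.pool(torch.relu(self.conv(e)))  # local character features
        c = c.transpose(1, 2)                    # (batch, seq_len // 2, channels)
        _, h = self.gru(c)                       # h: (2, batch, hidden)
        h = torch.cat([h[0], h[1]], dim=1)       # concatenate both directions
        t = torch.relu(self.transform(h))
        g = torch.sigmoid(self.gate(h))
        h = g * t + (1 - g) * h                  # highway connection
        return self.fc(h)

model = CharCNNBiGRU()
logits = model(torch.randint(0, 70, (8, 100)))  # batch of 8 length-100 texts
print(logits.shape)  # torch.Size([8, 4])
```

Because the input is character ids rather than word indices, the model needs no pre-trained word vectors, which is one of the motivations the abstract gives.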





Acknowledgements

This work is supported by “the Fundamental Research Funds for the Central Universities” (No. 2017XKQY082).

Author information

Correspondence to Yong Zhou or Wei Sun.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liu, B., Zhou, Y. & Sun, W. Character-level text classification via convolutional neural network and gated recurrent unit. Int. J. Mach. Learn. & Cyber. 11, 1939–1949 (2020). https://doi.org/10.1007/s13042-020-01084-9
