Text Classification and Transfer Learning Based on Character-Level Deep Convolutional Neural Networks

Sato, Minato; Orihara, Ryohei; Sei, Yuichi; Tahara, Yasuyuki; Ohsuga, Akihiko

doi:10.1007/978-3-319-93581-2_4

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10839)

Included in the following conference series:

International Conference on Agents and Artificial Intelligence

Abstract

The temporal (one-dimensional) convolutional neural network (temporal CNN, ConvNet) is an emerging technology for text understanding. The input to a ConvNet can be either a sequence of words or a sequence of characters; in the latter case, no natural language processing is needed. Past studies showed that character-level ConvNets work well for text classification on English and romanized Chinese corpora. In this article, we apply character-level ConvNets to a Japanese corpus. We confirmed that the ConvNets extract meaningful representations from both English and Japanese corpora. We then attempt to reuse the representations learned from a large-scale dataset through transfer learning. On news categorization and sentiment analysis tasks in Japanese, the ConvNets outperformed N-gram-based classifiers. In addition, our ConvNet transfer learning frameworks worked well when the target task was similar to the one used for pre-training.
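
To make the idea of a character-level temporal ConvNet concrete, the following is a minimal sketch in plain Python of the input pipeline such a model typically uses: one-hot character quantization, one temporal convolution with ReLU, and temporal max pooling. It is not the architecture from the chapter. The alphabet, input length, filter count, kernel width, and all function names are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import random

# Hypothetical alphabet; real systems choose one that fits their corpus.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "


def one_hot_encode(text, alphabet=ALPHABET, length=16):
    """Quantize text into a (length x |alphabet|) one-hot matrix."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in text.lower()[:length]:
        row = [0.0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1.0  # characters outside the alphabet stay all-zero
        rows.append(row)
    while len(rows) < length:  # zero-pad short inputs to a fixed length
        rows.append([0.0] * len(alphabet))
    return rows


def temporal_conv(x, kernels, bias):
    """Valid 1-D convolution along the time axis, followed by ReLU.

    x: T x C matrix, kernels: F x K x C weights, bias: F values.
    Returns a (T - K + 1) x F matrix.
    """
    width = len(kernels[0])
    out = []
    for t in range(len(x) - width + 1):
        frame = []
        for f, kernel in enumerate(kernels):
            total = bias[f]
            for dt in range(width):
                for c, w in enumerate(kernel[dt]):
                    total += w * x[t + dt][c]
            frame.append(max(0.0, total))  # ReLU
        out.append(frame)
    return out


def max_pool_over_time(h, size):
    """Non-overlapping max pooling along the time axis."""
    n_filters = len(h[0])
    return [
        [max(h[t + i][f] for i in range(size)) for f in range(n_filters)]
        for t in range(0, len(h) - size + 1, size)
    ]


rng = random.Random(0)
n_filters, kernel_width = 4, 3
kernels = [
    [[rng.uniform(-0.1, 0.1) for _ in ALPHABET] for _ in range(kernel_width)]
    for _ in range(n_filters)
]
bias = [0.0] * n_filters

x = one_hot_encode("good camera", length=16)  # 16 time steps
h = temporal_conv(x, kernels, bias)           # 14 time steps, 4 filters
p = max_pool_over_time(h, size=2)             # 7 time steps, 4 filters
print(len(x), len(h), len(p), len(p[0]))
```

In a transfer learning setting of the kind the abstract describes, the convolutional weights (here `kernels` and `bias`) would be learned on a large source dataset and then reused for a related target task, with only the later layers retrained; that step is omitted from this sketch.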

Notes

1. http://kakasi.namazu.org/
2. http://www.afpbb.com/
3. Rakuten, Inc. is one of the largest Japanese electronic commerce and Internet companies based in Tokyo, Japan.
4. http://www.nii.ac.jp/dsc/idr/en/rakuten/rakuten.html
5. https://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html
6. https://snap.stanford.edu/data/web-Amazon.html
7. http://www.imdb.com/
8. http://www.nii.ac.jp/dsc/idr/en/rakuten/rakuten.html

Acknowledgements

This work was supported by JSPS KAKENHI Grant Numbers 26330081, 26870201, 16K12411, and 17H04705. We use the Rakuten dataset, which is provided by the National Institute of Informatics (NII) under the contract between NII and Rakuten, Inc. We would like to thank NII and Rakuten, Inc.

Author information

Authors and Affiliations

Graduate School of Information Systems, The University of Electro-Communications, 1-5-1, Chofu-gaoka, Chofu-shi, Tokyo, Japan
Minato Sato, Ryohei Orihara, Yuichi Sei, Yasuyuki Tahara & Akihiko Ohsuga

Corresponding authors

Correspondence to Minato Sato, Ryohei Orihara, Yuichi Sei, Yasuyuki Tahara or Akihiko Ohsuga.

Editor information

Editors and Affiliations

Leiden University, Leiden, The Netherlands
Jaap van den Herik
University of Porto, Porto, Portugal
Ana Paula Rocha
INSTICC, Polytechnic Institute of Setúbal, Setubal, Portugal
Joaquim Filipe

Cite this paper

Sato, M., Orihara, R., Sei, Y., Tahara, Y., Ohsuga, A. (2018). Text Classification and Transfer Learning Based on Character-Level Deep Convolutional Neural Networks. In: van den Herik, J., Rocha, A., Filipe, J. (eds) Agents and Artificial Intelligence. ICAART 2017. Lecture Notes in Computer Science, vol 10839. Springer, Cham. https://doi.org/10.1007/978-3-319-93581-2_4

DOI: https://doi.org/10.1007/978-3-319-93581-2_4
Published: 21 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93580-5
Online ISBN: 978-3-319-93581-2
eBook Packages: Computer Science, Computer Science (R0)
