Abstract
Although text classification models with LSTM-CNN-style architectures have achieved great success, they still fall short in text feature representation and extraction. Most LSTM-based text representation methods use a single channel, and the convolution kernel size in the subsequent CNN feature extraction is usually fixed. In this study, we therefore propose an Adaptive Convolutional Kernel via Multi-Channel Representation (ACK-MCR) model to address these two problems. The multi-channel text representation is formed by two different Bi-LSTM networks that extract time-series features in the forward and backward directions, retaining more semantic information. After the CNN layers, a multi-scale feature attention adaptively selects features across scales for classification. Extensive experiments show that our model achieves competitive performance against state-of-the-art baselines on six benchmark datasets.
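The paper's exact formulation of the multi-scale feature attention is not reproduced here; as a rough illustration of the idea — convolution kernels of several sizes each produce a feature map, a scalar descriptor summarizes each scale, and a softmax over the descriptors adaptively weights the scales before fusion — the following is a minimal NumPy sketch. All function names and shapes are hypothetical, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of scale descriptors.
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_scale_attention(feature_maps):
    """Fuse CNN feature maps from different kernel sizes.

    feature_maps: list of (channels, length) arrays, one per kernel size.
    Returns a (channels,) vector: an attention-weighted mix of the
    max-pooled features from each scale.
    """
    # One scalar descriptor per scale (here: global average of the map).
    descriptors = np.array([fm.mean() for fm in feature_maps])
    # Attention weights over scales sum to 1.
    weights = softmax(descriptors)
    # Max-pool each map over the length dimension -> (num_scales, channels).
    pooled = np.array([fm.max(axis=1) for fm in feature_maps])
    # Weighted sum across scales -> (channels,).
    return (weights[:, None] * pooled).sum(axis=0)

# Toy example: 8 channels, sequence length 20, kernel sizes 2/3/4
# (valid convolution shortens the length by k - 1).
rng = np.random.default_rng(0)
maps = [rng.standard_normal((8, 20 - k + 1)) for k in (2, 3, 4)]
fused = multi_scale_attention(maps)
print(fused.shape)  # (8,)
```

The pooling and descriptor choices here are placeholders; the point is only that the scale weights are computed from the data itself, so the effective kernel size adapts per input rather than being fixed.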
Acknowledgements
This work was supported in part by the National Science Foundation of China under Grants No. 61967006 and No. 61562027, and by a project of the Jiangxi Provincial Department of Education under Grant No. GJJ180321.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Wang, C., Fan, X. (2020). Adaptive Convolution Kernel for Text Classification via Multi-channel Representations. In: Farkaš, I., Masulli, P., Wermter, S. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2020. Lecture Notes in Computer Science, vol. 12397. Springer, Cham. https://doi.org/10.1007/978-3-030-61616-8_57
Print ISBN: 978-3-030-61615-1
Online ISBN: 978-3-030-61616-8