Abstract
Although text classification models with LSTM-CNN-style architectures have achieved great success, they still fall short in text feature representation and extraction. Most LSTM-based text representation methods use a single channel, and the convolution kernel size in the subsequent CNN feature extraction is usually fixed. In this study, we therefore propose an Adaptive Convolutional Kernel via Multi-Channel Representation (ACK-MCR) model to address these two problems. The multi-channel text representation is formed by two different Bi-LSTM networks that extract time-series features in the forward and backward directions, retaining more semantic information. After the CNN layers, a multi-scale feature attention adaptively selects features across scales for classification. Extensive experiments show that our model achieves competitive performance against state-of-the-art baselines on six benchmark datasets.
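The paper's exact formulation of the multi-scale feature attention is not reproduced here; as a rough illustration of the idea — convolution kernels of several sizes each produce a feature map, a scalar descriptor summarizes each scale, and a softmax over the descriptors adaptively weights the scales before fusion — the following is a minimal NumPy sketch. All function names and shapes are hypothetical, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of scale descriptors.
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_scale_attention(feature_maps):
    """Fuse CNN feature maps from different kernel sizes.

    feature_maps: list of (channels, length) arrays, one per kernel size.
    Returns a (channels,) vector: an attention-weighted mix of the
    max-pooled features from each scale.
    """
    # One scalar descriptor per scale (here: global average of the map).
    descriptors = np.array([fm.mean() for fm in feature_maps])
    # Attention weights over scales sum to 1.
    weights = softmax(descriptors)
    # Max-pool each map over the length dimension -> (num_scales, channels).
    pooled = np.array([fm.max(axis=1) for fm in feature_maps])
    # Weighted sum across scales -> (channels,).
    return (weights[:, None] * pooled).sum(axis=0)

# Toy example: 8 channels, sequence length 20, kernel sizes 2/3/4
# (valid convolution shortens the length by k - 1).
rng = np.random.default_rng(0)
maps = [rng.standard_normal((8, 20 - k + 1)) for k in (2, 3, 4)]
fused = multi_scale_attention(maps)
print(fused.shape)  # (8,)
```

The pooling and descriptor choices here are placeholders; the point is only that the scale weights are computed from the data itself, so the effective kernel size adapts per input rather than being fixed.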
Acknowledgements
This work was supported in part by the National Science Foundation of China under Grants No. 61967006 and No. 61562027, and by a project of the Jiangxi Provincial Department of Education under Grant No. GJJ180321.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Wang, C., Fan, X. (2020). Adaptive Convolution Kernel for Text Classification via Multi-channel Representations. In: Farkaš, I., Masulli, P., Wermter, S. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2020. Lecture Notes in Computer Science, vol. 12397. Springer, Cham. https://doi.org/10.1007/978-3-030-61616-8_57
Print ISBN: 978-3-030-61615-1
Online ISBN: 978-3-030-61616-8