Adaptive Convolution Kernel for Text Classification via Multi-channel Representations

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12397)

Abstract

Although text classification models with LSTM-CNN-like structures have achieved great success, they still fall short in text feature representation and extraction. Most LSTM-based text representation methods use a single channel, and the convolution kernel size in the subsequent CNN feature-extraction stage is usually fixed. In this study, we therefore propose an Adaptive Convolutional Kernel via Multi-Channel Representation (ACK-MCR) model to address these two problems. The multi-channel text representation is formed by two different Bi-LSTM networks that extract time-series features in the forward and backward directions, retaining more semantic information. After the CNN layers, a multi-scale feature attention mechanism adaptively selects features at multiple scales for classification. Extensive experiments show that our model achieves competitive performance against state-of-the-art baselines on six benchmark datasets.
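The adaptive kernel-selection idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: mean-pooling over a sliding window stands in for learned convolution filters, and the attention scores are simple sums of the pooled features, but the structure mirrors the abstract's description — features are extracted at several kernel widths, and a softmax over per-scale summaries decides how much each scale contributes to the final representation. All function names here are illustrative.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def window_features(x, k):
    # Stand-in for a 1-D convolution with kernel width k:
    # mean-pool each length-k window of the (T, d) sequence x.
    T = x.shape[0]
    return np.array([x[t:t + k].mean(axis=0) for t in range(T - k + 1)])

def multi_scale_attention(x, kernel_sizes=(1, 3, 5)):
    # Extract a max-pooled feature vector at each kernel width, then
    # fuse them with attention weights computed from per-scale scores.
    feats = [window_features(x, k).max(axis=0) for k in kernel_sizes]  # each (d,)
    scores = np.array([f.sum() for f in feats])   # one scalar score per scale
    weights = softmax(scores)                      # adaptive scale selection
    fused = sum(w * f for w, f in zip(weights, feats))
    return fused, weights

rng = np.random.default_rng(0)
seq = rng.normal(size=(20, 8))   # toy sequence: T=20 time steps, d=8 channels
fused, w = multi_scale_attention(seq)
print(fused.shape, w)
```

In the full model, `seq` would be the multi-channel output of the two Bi-LSTM networks and the pooled features would come from trained CNN filters; this sketch only shows how softmax attention over multiple kernel scales yields an adaptively weighted feature vector.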



Acknowledgements

This work was supported in part by the National Science Foundation of China under Grants No. 61967006 and No. 61562027, and by a project of the Jiangxi Provincial Department of Education under Grant No. GJJ180321.

Author information


Correspondence to Cheng Wang.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, C., Fan, X. (2020). Adaptive Convolution Kernel for Text Classification via Multi-channel Representations. In: Farkaš, I., Masulli, P., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2020. Lecture Notes in Computer Science, vol 12397. Springer, Cham. https://doi.org/10.1007/978-3-030-61616-8_57

  • DOI: https://doi.org/10.1007/978-3-030-61616-8_57

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61615-1

  • Online ISBN: 978-3-030-61616-8

  • eBook Packages: Computer Science (R0)
