Abstract
Large-scale pretrained models have led to a series of breakthroughs in text classification. However, the lack of global structure information limits the performance of pretrained models. In this paper, we propose a novel network named BertCA, which employs BERT, Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT) to handle the task of text classification simultaneously. It aims to learn a rich sentence representation that combines semantic representation, global structure information and neighborhood node features, thereby leveraging the complementary strengths of pretrained models and graph models. Experimental results on the R8, R52, Ohsumed and MR benchmarks show that our model obtains significant performance improvements and achieves state-of-the-art results on all four datasets.
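The abstract gives no implementation details, but a minimal sketch of the kind of hybrid it describes might look as follows, assuming PyTorch, HuggingFace Transformers and PyTorch Geometric; the layer sizes, the corpus-level document graph and the way the BERT, GCN and GAT outputs are combined are illustrative assumptions, not the authors' released model.

import torch
import torch.nn as nn
from transformers import BertModel
from torch_geometric.nn import GCNConv, GATConv

class BertCASketch(nn.Module):
    """Hypothetical sketch: BERT document embeddings refined by a GCN layer and a GAT layer."""
    def __init__(self, num_classes, hidden_dim=256, heads=4):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")            # semantic representation
        self.gcn = GCNConv(self.bert.config.hidden_size, hidden_dim)          # global structure over a corpus graph
        self.gat = GATConv(hidden_dim, hidden_dim, heads=heads, concat=False) # attention over neighboring nodes
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, input_ids, attention_mask, edge_index):
        # One node per document; the pooled [CLS] vector is its initial node feature.
        x = self.bert(input_ids=input_ids, attention_mask=attention_mask).pooler_output
        x = torch.relu(self.gcn(x, edge_index))   # propagate along graph edges (global structure)
        x = torch.relu(self.gat(x, edge_index))   # re-weight neighborhood features with attention
        return self.classifier(x)                 # document-level class logits

How the document graph (edge_index) is built, and whether the BERT and graph branches are interpolated rather than stacked, are design choices the abstract does not specify.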
Supported by Ping An Technology (Shenzhen) Co., Ltd.
References
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25(2), 1097–1105 (2012)
Zaremba, W., Sutskever, I., Vinyals, O.: Recurrent neural network regularization. Eprint Arxiv (2014)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
Velickovic, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)
Yao, L., Mao, C., Luo, Y.: Graph convolutional networks for text classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7370–7377 (2019)
Lin, Y., Meng, Y., Sun, X., et al.: BertGCN: transductive text classification by combining GCN and BERT. In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (2021)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: EMNLP (2014)
Peters, M.E., et al.: Deep contextualized word representations. In: NAACL-HLT (2018)
Vaswani, A., et al.: Attention is all you need. In: NeurIPS (2017)
Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre-training (2018)
Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., Le, Q.V.: XLNet: generalized autoregressive pretraining for language understanding. In: NeurIPS (2019)
Liu, Y., et al.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: a lite BERT for self-supervised learning of language representations (2019)
Wu, F., Zhang, T., de Souza, A.H., Jr., Fifty, C., Yu, T., Weinberger, K.Q.: Simplifying graph convolutional networks. arXiv preprint arXiv:1902.07153 (2019)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Hao, K., Li, J., Hou, C., Wang, X., Li, P. (2021). Combining Pretrained and Graph Models for Text Classification. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92306-8
Online ISBN: 978-3-030-92307-5