BERT-KG: A Short Text Classification Model Based on Knowledge Graph and Deep Semantics

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13028)

Abstract

Chinese short text classification is an increasingly significant task in Natural Language Processing (NLP). Unlike documents and paragraphs, short texts are brief, sparse, and non-standard, which poses enormous challenges for traditional classification methods. In this paper, we propose a novel model named BERT-KG, which classifies Chinese short text promptly and accurately, overcoming these difficulties. BERT-KG enriches short text features with background knowledge retrieved from a knowledge graph, embedding the triples of the target entity into a BERT-based model. It then fuses the dynamic word vectors with the knowledge features of the short text to form a single feature vector. Finally, the learned feature vector is fed into a Softmax classifier to obtain the target label for the short text. Extensive experiments on two real-world datasets demonstrate that BERT-KG significantly improves classification performance over state-of-the-art baselines.
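
The abstract outlines a pipeline: retrieve background knowledge (entity triples) from a knowledge graph, encode the short text with BERT, fuse the two representations, and classify with Softmax. Below is a minimal sketch of one plausible reading of that architecture; the class name `BertKGClassifier`, the triple-embedding scheme (a plain lookup table averaged over retrieved triple ids), the `bert-base-chinese` checkpoint, and fusion by concatenation are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of the BERT-KG pipeline as described in the abstract.
# Assumptions (not from the paper): Hugging Face `transformers` for the
# BERT encoder, a toy embedding table for KG triple ids, and fusion of
# text and knowledge vectors by concatenation.
import torch
import torch.nn as nn
from transformers import BertModel


class BertKGClassifier(nn.Module):
    def __init__(self, num_labels: int, kg_vocab_size: int, kg_dim: int = 128,
                 bert_name: str = "bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)      # dynamic word vectors
        self.kg_embed = nn.Embedding(kg_vocab_size, kg_dim)   # ids of (head, relation, tail) elements
        self.classifier = nn.Linear(self.bert.config.hidden_size + kg_dim, num_labels)

    def forward(self, input_ids, attention_mask, triple_ids):
        # [CLS]-pooled representation of the short text
        text_vec = self.bert(input_ids=input_ids,
                             attention_mask=attention_mask).pooler_output
        # Background knowledge: average the embeddings of the retrieved triples
        kg_vec = self.kg_embed(triple_ids).mean(dim=1)
        # Fuse text and knowledge features, then classify with Softmax
        fused = torch.cat([text_vec, kg_vec], dim=-1)
        return torch.log_softmax(self.classifier(fused), dim=-1)
```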

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhong, Y., Zhang, Z., Zhang, W., Zhu, J. (2021). BERT-KG: A Short Text Classification Model Based on Knowledge Graph and Deep Semantics. In: Wang, L., Feng, Y., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2021. Lecture Notes in Computer Science (LNAI), vol. 13028. Springer, Cham. https://doi.org/10.1007/978-3-030-88480-2_58

  • DOI: https://doi.org/10.1007/978-3-030-88480-2_58

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88479-6

  • Online ISBN: 978-3-030-88480-2

  • eBook Packages: Computer Science (R0)
