Abstract
Industry classification of startup companies is useful not only for guiding investment strategies but also for identifying potential competitors. It is essentially a challenging domain-specific text classification task. Because no such dataset is available, in this paper we first construct a dataset for industry classification based on the companies listed on the Chinese National Equities Exchange and Quotations (NEEQ), consisting of 17,604 annual business reports and their corresponding industry labels. Second, we introduce a novel Knowledge Graph Enriched BERT model (KGEB), which understands domain-specific text by enhancing word representations with external knowledge and makes full use of the local knowledge graph without pre-training. Experimental results show the promising performance of the proposed model and demonstrate its effectiveness on the domain-specific classification task.
Acknowledgements
We appreciate the insightful feedback from the anonymous reviewers. This work is jointly supported by the Natural Science Foundation of China (No. 62006061), the Strategic Emerging Industry Development Special Funds of Shenzhen (No. JCYJ20200109113441941), and the Stable Support Program for Higher Education Institutions of Shenzhen (No. GXWD20201230155427003-20200824155011001).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, S., Pan, Y., Xu, Z., Hu, B., Wang, X. (2021). Enriching BERT With Knowledge Graph Embedding For Industry Classification. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1517. Springer, Cham. https://doi.org/10.1007/978-3-030-92310-5_82
DOI: https://doi.org/10.1007/978-3-030-92310-5_82
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92309-9
Online ISBN: 978-3-030-92310-5
eBook Packages: Computer Science, Computer Science (R0)