Enriching BERT With Knowledge Graph Embedding For Industry Classification

Wang, Shiyue; Pan, Youcheng; Xu, Zhenran; Hu, Baotian; Wang, Xiaolong

doi:10.1007/978-3-030-92310-5_82

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1517)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Industry classification for startup companies is useful both for guiding investment strategies and for identifying potential competitors. It is a challenging domain-specific text classification task. Because such a dataset is lacking, we first construct a dataset for industry classification based on the companies listed on the Chinese National Equities Exchange and Quotations (NEEQ); it consists of 17,604 annual business reports and their corresponding industry labels. Second, we introduce a novel Knowledge Graph Enriched BERT model (KGEB), which can understand domain-specific text by enhancing word representations with external knowledge and can make full use of the local knowledge graph without pre-training. Experimental results show the promising performance of the proposed model and demonstrate its effectiveness on the domain-specific classification task.

Acknowledgements

We appreciate the insightful feedback from the anonymous reviewers. This work is jointly supported by grants: Natural Science Foundation of China (No. 62006061), Strategic Emerging Industry Development Special Funds of Shenzhen (No. JCYJ20200109113441941) and Stable Support Program for Higher Education Institutions of Shenzhen (No. GXWD20201230155427003-20200824155011001).

Author information

Authors and Affiliations

Harbin Institute of Technology, Shenzhen, China
Shiyue Wang, Youcheng Pan, Zhenran Xu, Baotian Hu & Xiaolong Wang

Corresponding author

Correspondence to Baotian Hu.

Editor information

Editors and Affiliations

Sampoerna University, Jakarta, Indonesia
Teddy Mantoro
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Sampoerna University, Jakarta, Indonesia
Media Anugerah Ayu
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong
Universitas Indonesia, Depok, Indonesia
Achmad Nizar Hidayanto

About this paper

Cite this paper

Wang, S., Pan, Y., Xu, Z., Hu, B., Wang, X. (2021). Enriching BERT With Knowledge Graph Embedding For Industry Classification. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1517. Springer, Cham. https://doi.org/10.1007/978-3-030-92310-5_82

DOI: https://doi.org/10.1007/978-3-030-92310-5_82
Published: 02 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92309-9
Online ISBN: 978-3-030-92310-5
eBook Packages: Computer Science, Computer Science (R0)
