ELETerm: A Chinese Electric Power Term Dataset

  • Conference paper
Knowledge Graph and Semantic Computing: Knowledge Graph Empowers the Digital Economy (CCKS 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1669)

Abstract

Domain-specific knowledge graph construction and its applications are attracting growing attention from researchers. However, the lack of professional knowledge and term datasets restricts the development of domain-specific knowledge graphs. In the electric power field, knowledge graphs have proven effective in electric fault monitoring, power consumer service, and dispatching decision-making. Although electric power knowledge graphs have strong application prospects, it is difficult for artificial intelligence experts to produce the professional knowledge and terms needed for knowledge graph construction. To assist in building electric power knowledge graphs, we introduce a new Chinese electric power term dataset (ELETerm) containing 10,043 terms. To extract the terms, we make full use of reliable data resources from the State Grid Jiangsu Electric Power Company Research Institute. Our approach comprises four stages: word extraction, candidate term selection, term expansion, and dataset generation. We present statistics and an analysis of the dataset. The dataset is publicly available on GitHub under CC BY-SA 4.0.

Acknowledgements

This work was supported by the Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. under Grant J2021129, "Research on the construction technology of relay protection knowledge graph".

Corresponding author

Correspondence to Liangliang Song.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yang, Y., Song, L., Zhuang, S., Chen, S., Li, J. (2022). ELETerm: A Chinese Electric Power Term Dataset. In: Sun, M., et al. Knowledge Graph and Semantic Computing: Knowledge Graph Empowers the Digital Economy. CCKS 2022. Communications in Computer and Information Science, vol 1669. Springer, Singapore. https://doi.org/10.1007/978-981-19-7596-7_17

  • DOI: https://doi.org/10.1007/978-981-19-7596-7_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-7595-0

  • Online ISBN: 978-981-19-7596-7

  • eBook Packages: Computer Science, Computer Science (R0)
