DOI: 10.1145/3653081.3653131

A Pre-training Method Inspired by Large Language Model for Power Named Entity Recognition

Published: 03 May 2024

ABSTRACT

In recent years, the field of natural language processing has seen remarkable advances driven by the success of large language models, which leverage the Transformer architecture and pre-training techniques to achieve impressive results. In this paper, we draw inspiration from large language models and apply these techniques to named entity recognition in the power grid domain, a task that is critical for building power grid knowledge graphs and question-answering systems. Specifically, we propose a BERT-CNN-BiGRU-CRF deep learning model for named entity recognition. The model harnesses the semantic modeling capability and pre-trained knowledge of BERT, which is based on the Transformer architecture; the CNN and BiGRU layers capture local and global features, respectively; and a CRF layer performs label classification. This combination of components yields a high level of recognition accuracy. To evaluate the proposed model, we train it on annotated maintenance plan data and compare its results with those of other commonly used models using recall, precision, and F1 score, the metrics widely employed in named entity recognition tasks. Our model achieves the best results on all three metrics, demonstrating its advantage over the compared models.
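To make the architecture described in the abstract concrete, the following is a minimal sketch of a BERT-CNN-BiGRU-CRF tagger in PyTorch. It is not the authors' code: it assumes the Hugging Face transformers and pytorch-crf packages, and the checkpoint name, channel counts, and kernel size are illustrative choices rather than values reported in the paper.

```python
# Hypothetical sketch of a BERT-CNN-BiGRU-CRF sequence tagger (not the paper's code).
# Assumptions: `transformers` and `pytorch-crf` are installed; the checkpoint name,
# hidden sizes, and kernel size are illustrative, not parameters from the paper.
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF


class BertCnnBiGruCrf(nn.Module):
    def __init__(self, num_tags: int, bert_name: str = "bert-base-chinese",
                 cnn_channels: int = 256, gru_hidden: int = 128):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)      # pre-trained contextual encoder
        hidden = self.bert.config.hidden_size
        # 1-D convolution over the token dimension captures local n-gram features.
        self.cnn = nn.Conv1d(hidden, cnn_channels, kernel_size=3, padding=1)
        # Bidirectional GRU models longer-range, sentence-level dependencies.
        self.bigru = nn.GRU(cnn_channels, gru_hidden, batch_first=True, bidirectional=True)
        self.emission = nn.Linear(2 * gru_hidden, num_tags)   # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)            # structured label decoding

    def _emissions(self, input_ids, attention_mask):
        x = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        x = torch.relu(self.cnn(x.transpose(1, 2))).transpose(1, 2)  # (B, T, H) -> conv -> (B, T, C)
        x, _ = self.bigru(x)
        return self.emission(x)

    def loss(self, input_ids, attention_mask, tags):
        # `tags` must cover every position inside the attention mask, so special
        # tokens such as [CLS]/[SEP] need placeholder labels (e.g. "O").
        emissions = self._emissions(input_ids, attention_mask)
        return -self.crf(emissions, tags, mask=attention_mask.bool(), reduction="mean")

    def decode(self, input_ids, attention_mask):
        emissions = self._emissions(input_ids, attention_mask)
        return self.crf.decode(emissions, mask=attention_mask.bool())
```

The decoded tag sequences can then be scored with entity-level precision, recall, and F1, for example with a library such as seqeval, matching the evaluation metrics named in the abstract.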




Published in

IoTAAI '23: Proceedings of the 2023 5th International Conference on Internet of Things, Automation and Artificial Intelligence
November 2023, 902 pages
ISBN: 9798400716485
DOI: 10.1145/3653081

              Copyright © 2023 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States


              Qualifiers

              • research-article
              • Research
              • Refereed limited
