ABSTRACT
To address the lack of large-scale, high-quality NER datasets and of NER research in the steel industry, this paper constructs a steel-industry NER dataset of 4835 data items annotated with four entity categories: device, material, process, and product. A NER method based on the MacBERT_large-BiLSTM-CRF model is then built. The method first uses the MacBERT model to generate semantically rich dynamic word vectors, then feeds these vectors into a BiLSTM network to obtain global features, and finally uses a CRF model to impose constraints on the predicted labels so that the generated label sequences are valid. The model was compared with three other models. It achieved a precision of 90.01%, a recall of 91.02%, and an F1 value of 90.51%, outperforming all three.
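The role of the final constraint step can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a BIO tagging scheme (the abstract does not name one), uses hand-picked hypothetical scores in place of BiLSTM outputs, and replaces the learned transition scores of a real CRF with hard validity rules. The names `TAGS`, `is_valid_transition`, and `constrained_viterbi` are illustrative only.

```python
import math

# Hypothetical BIO tags for two of the four entity categories in the abstract.
TAGS = ["O", "B-DEVICE", "I-DEVICE", "B-MATERIAL", "I-MATERIAL"]


def is_valid_transition(prev_tag, next_tag):
    """An I-X tag may only follow B-X or I-X of the same entity type."""
    if not next_tag.startswith("I-"):
        return True
    entity = next_tag[2:]
    return prev_tag in ("B-" + entity, "I-" + entity)


def constrained_viterbi(emissions):
    """Return the highest-scoring tag sequence that obeys the BIO rules.

    emissions: one dict per token mapping tag -> score
    (a stand-in for the per-token scores a BiLSTM would produce).
    """
    # A sequence may not start with an I- tag.
    scores = {
        tag: (-math.inf if tag.startswith("I-") else emissions[0][tag])
        for tag in TAGS
    }
    backpointers = []
    for emission in emissions[1:]:
        new_scores, pointers = {}, {}
        for tag in TAGS:
            best_prev, best_score = None, -math.inf
            for prev in TAGS:
                if is_valid_transition(prev, tag) and scores[prev] > best_score:
                    best_prev, best_score = prev, scores[prev]
            new_scores[tag] = best_score + emission[tag]
            pointers[tag] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag.
    path = [max(TAGS, key=lambda tag: scores[tag])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for the three tokens "blast", "furnace", "runs".
emissions = [
    {"O": 0.1, "B-DEVICE": 2.0, "I-DEVICE": 0.0, "B-MATERIAL": 0.5, "I-MATERIAL": 0.0},
    {"O": 0.2, "B-DEVICE": 0.1, "I-DEVICE": 1.0, "B-MATERIAL": 0.0, "I-MATERIAL": 1.5},
    {"O": 2.0, "B-DEVICE": 0.0, "I-DEVICE": 0.0, "B-MATERIAL": 0.0, "I-MATERIAL": 0.0},
]

# Picking the top-scoring tag per token independently can give an invalid sequence.
greedy = [max(TAGS, key=lambda tag: e[tag]) for e in emissions]
print(greedy)                          # ['B-DEVICE', 'I-MATERIAL', 'O']
print(constrained_viterbi(emissions))  # ['B-DEVICE', 'I-DEVICE', 'O']
```

In this toy example the per-token maximum places `I-MATERIAL` directly after `B-DEVICE`, which is not a valid label sequence. Decoding over whole sequences removes that error. A trained CRF layer would learn soft transition scores from data rather than apply fixed rules.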