Named entity recognition (NER) for Chinese agricultural diseases and pests based on discourse topic and attention mechanism

  • Special Issue
Evolutionary Intelligence

Abstract

Named entities for agricultural diseases and pests have complex word formation, and word combination and entity embedding are common. In Chinese agricultural disease and pest texts in particular, recognition is hampered by varied entity naming modes, fuzzy entity boundaries, inadequate feature extraction and inconsistent labeling of entity boundaries. To address these problems, this article combines discourse topic features with an attention mechanism and proposes Attention-based SoftLexicon with TF-IDF (ASLT) for recognizing agricultural disease and pest entities. The method divides the matched words into sets according to the position of each character within the word, merges discourse topic features into the calculation of lexical information, and introduces an attention mechanism, which improves the recognition of Chinese agricultural disease and pest entities. To improve the interpretability of the model, we designed a flow chart that explains its main principles and steps, and we explain the model through visualization. We selected 1061 Chinese agricultural news texts and constructed the Corpus of Chinese Named Entities of Diseases and Pests (CCNEDP), in which a total of 7806 named entities of agricultural diseases and pests are labeled. In our experiments, ASLT effectively recognizes the entities in Chinese agricultural texts and achieves a precision of 93.57%, a recall of 92.79% and an F1 score of 93.18% on CCNEDP. Compared with the other entity recognition methods evaluated, ASLT shows better recognition performance in terms of accuracy and operating efficiency. The implementation of this work is publicly available at https://github.com/azureskymoon/Lexicon-TFIDF-DTopic-master/tree/master.
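As a rough illustration of dividing word sets by character position, the sketch below follows the general SoftLexicon idea of grouping, for each character, the lexicon words in which it appears at the Beginning, Middle or End, or as a Single-character word. It is not the authors' implementation: the function name `bmes_word_sets`, the toy lexicon and the example sentence are assumptions made for illustration, and the TF-IDF and discourse topic weighting and the attention step are omitted.

```python
def bmes_word_sets(sentence, lexicon, max_len=5):
    """Group lexicon matches per character by the character's position in the word.

    Returns one dict per character with keys "B", "M", "E", "S".
    Hypothetical helper for illustration only.
    """
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Try every substring of up to max_len characters starting here.
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[start]["S"].add(word)
            else:
                sets[start]["B"].add(word)        # first character
                sets[end - 1]["E"].add(word)      # last character
                for mid in range(start + 1, end - 1):
                    sets[mid]["M"].add(word)      # interior characters
    return sets


# Toy example (assumed data): "水稻纹枯病" is rice sheath blight.
lexicon = {"水稻", "纹枯病", "水稻纹枯病", "病"}
word_sets = bmes_word_sets("水稻纹枯病", lexicon)
for char, groups in zip("水稻纹枯病", word_sets):
    print(char, groups)
```

In a full model, each of the four sets would be pooled into a vector and concatenated with the character embedding; the weighting used in that pooling is where, per the abstract, the TF-IDF and discourse topic information would enter.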

Acknowledgements

This work is partially supported by the Natural Science Foundation of China (Grants 31771679, 31671589), the Major Science and Technology Project of Anhui Province, China (Grants 18030901034, 201904e01020006), the Natural Science Foundation of Anhui Province, China (Grant 2108085MF209), the Key Laboratory of Agricultural Electronic Commerce, Ministry of Agriculture of China (Grants AEC2018001, AEC2021001), the University Collaborative Innovation Project of Anhui Province, China (Grant GXXT-2019-013), and the Natural Science Research Project of the Anhui Provincial Department of Education (Grants KJ2020A0107, KJ2021A1550).

Author information

Corresponding authors

Correspondence to Chao Wang or Lichuan Gu.

About this article

Cite this article

Wang, C., Gao, J., Rao, H. et al. Named entity recognition (NER) for Chinese agricultural diseases and pests based on discourse topic and attention mechanism. Evol. Intel. 17, 457–466 (2024). https://doi.org/10.1007/s12065-022-00727-w

  • DOI: https://doi.org/10.1007/s12065-022-00727-w
