A Collaborative Optimization-Guided Entity Extraction Scheme

  • Conference paper
Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2021)

Abstract

Entity extraction, one of the most basic tasks in information extraction and retrieval, has long been an important research area in natural language processing. Most traditional entity extraction methods require manual hyperparameter tuning, which is time-consuming and prone to settling on locally optimal settings. To avoid these limitations, this paper proposes a novel scheme for extracting named entities in which the model hyperparameters are adjusted automatically to improve extraction performance. The proposed scheme combines bidirectional encoder representations from transformers (BERT) with a conditional random field (CRF). Specifically, by incorporating the collaborative computing paradigm, the particle swarm optimization (PSO) algorithm is used to search cooperatively and automatically for the best hyperparameter values. Experimental results on two public datasets and a steel inquiry dataset verify that the proposed scheme can effectively improve the performance of entity extraction.
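
The PSO-driven hyperparameter search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`pso_search`, `toy_validation_f1`), the choice of hyperparameters (log learning rate and dropout), their bounds, and the PSO coefficients are all assumptions made for illustration. In particular, `toy_validation_f1` is a hypothetical stand-in for the expensive step of training a BERT-CRF model and measuring its F1 score on held-out data.

```python
import random


def pso_search(objective, bounds, n_particles=10, n_iters=30,
               inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `objective` over box-constrained parameters with basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Start each particle at a random point inside the bounds, at rest.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position (personal best).
    pbest = [p[:] for p in pos]
    pbest_score = [objective(p) for p in pos]
    # The swarm shares the best position found so far (global best).
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = momentum + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move, then clamp back into the allowed range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            score = objective(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


def toy_validation_f1(params):
    """Hypothetical stand-in for 'train the model, return validation F1'.

    A smooth surface peaking at log10(lr) = -4.5 and dropout = 0.1;
    these values are illustrative, not results from the paper.
    """
    log_lr, dropout = params
    return 0.9 - 0.05 * (log_lr + 4.5) ** 2 - 0.5 * (dropout - 0.1) ** 2


# Assumed search space: log10(learning rate) and dropout probability.
bounds = [(-6.0, -3.0), (0.0, 0.5)]
best_params, best_score = pso_search(toy_validation_f1, bounds)
print("best log10(lr), dropout:", best_params)
print("best toy score:", best_score)
```

In a real setting each call to the objective would be a full training run, so the number of particles and iterations trades search quality against compute cost; the particles share only the global best position, which is the cooperative element of the search.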

Notes

  1. http://static.bosonnlp.com/dev/resource.

  2. http://www.ling.lancs.ac.uk/corplang/pdcorpus/pdcorpus.html.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China under Grant U1836106, in part by the Beijing Natural Science Foundation under Grants 19L2029 and M21032, in part by the Scientific and Technological Innovation Foundation of Shunde Graduate School, USTB, under Grants BK19BF006 and BK20BF010, and in part by the Fundamental Research Funds for the University of Science and Technology Beijing under Grant FRF-BD-19-012A.

Author information

Corresponding author

Correspondence to Xiong Luo.

Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Peng, Q., Luo, X., Shen, H., Huang, Z., Chen, M. (2021). A Collaborative Optimization-Guided Entity Extraction Scheme. In: Gao, H., Wang, X. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 407. Springer, Cham. https://doi.org/10.1007/978-3-030-92638-0_12

  • DOI: https://doi.org/10.1007/978-3-030-92638-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92637-3

  • Online ISBN: 978-3-030-92638-0

  • eBook Packages: Computer Science, Computer Science (R0)