Abstract
In recent years, information technologies such as big data, natural language processing (NLP), and data mining have matured, and demand has grown for technologies that mine and exploit unstructured, natural-language public security information resources. Customs anti-smuggling departments likewise urgently need to apply such technologies to case processing. Currently, one way to obtain information from customs anti-smuggling text is to mine the case files, transcripts, and other textual materials generated by solved cases in order to uncover hidden risks in those cases. These materials are voluminous, so searching them for practical information on smuggling cases within a short time is difficult, time-consuming, and inefficient. To address this problem, this project uses machine learning algorithms to construct a public security entity recognition model for text data. First, the relevant information is labeled and used as the training set, and the labeled entities are highlighted to exclude interference from irrelevant information. Second, a BiLSTM-CRF-based algorithm is combined with an entity pre-training model built on BERT to make the case framework more specific and clear, so that it can be applied to other similar cases. This enables customs anti-smuggling departments to collect and organize relevant information faster, greatly shortening the time needed to solve cases and improving case-solving efficiency.
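The labeling and highlighting step described in the abstract can be illustrated with a minimal sketch. It assumes the common BIO tagging scheme (B- opens an entity, I- continues it, O is outside any entity), which the abstract does not name. The label set (PER, GOODS, LOC), the sample sentence, and the function names `extract_entities` and `highlight` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Optional, Tuple

# (label, start index, end index exclusive, surface text)
Entity = Tuple[str, int, int, str]


def extract_entities(tokens: List[str], tags: List[str]) -> List[Entity]:
    """Collect entity spans from a BIO-tagged token sequence (assumed scheme)."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    entities: List[Entity] = []
    start: Optional[int] = None
    label: Optional[str] = None

    def close(end: int) -> None:
        # Emit the currently open span, if any, and reset the state.
        nonlocal start, label
        if start is not None and label is not None:
            entities.append((label, start, end, " ".join(tokens[start:end])))
        start, label = None, None

    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            close(i)
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == label:
            continue  # the open span keeps growing
        else:
            # "O", or an I- tag that does not continue the open span:
            # close any open span and treat this token as outside.
            close(i)
    close(len(tags))
    return entities


def highlight(tokens: List[str], tags: List[str]) -> str:
    """Render the sentence with each entity wrapped as [text|LABEL]."""
    spans = {s: (lab, e) for lab, s, e, _ in extract_entities(tokens, tags)}
    out: List[str] = []
    i = 0
    while i < len(tokens):
        if i in spans:
            lab, end = spans[i]
            out.append("[" + " ".join(tokens[i:end]) + "|" + lab + "]")
            i = end
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


# Hypothetical labeled sentence, not drawn from any real case file.
tokens = ["Zhang", "San", "shipped", "frozen", "beef", "through", "Shenzhen"]
tags = ["B-PER", "I-PER", "O", "B-GOODS", "I-GOODS", "O", "B-LOC"]
print(highlight(tokens, tags))
```

In a pipeline of the kind the abstract outlines, tag sequences like `tags` would come from human annotators for the training set and from the model's decoded output at inference time; the same span extraction applies to both.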
Acknowledgements
This research was supported by the 2023 College Students Innovation and Entrepreneurship Training Program (Grant No. 202312213035Z).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yuan, T., Wen, X., Zhao, P., Wang, X., Chen, Y., Qiu, M. (2023). Research on BERT-Based Text Entity Recognition Model for Customs Anti-smuggling. In: Barolli, L. (eds) Advances in Intelligent Networking and Collaborative Systems. INCoS 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 182. Springer, Cham. https://doi.org/10.1007/978-3-031-40971-4_25
DOI: https://doi.org/10.1007/978-3-031-40971-4_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40970-7
Online ISBN: 978-3-031-40971-4
eBook Packages: Intelligent Technologies and Robotics (R0)