Abstract
Most existing case retrieval platforms cannot effectively extract feature information from Chinese legal cases, and therefore perform unsatisfactorily on indicators such as the relevance and accuracy of retrieval results. To address this challenge, we propose applying the LBBT model, which incorporates a domain-specific, topic-based text summary generation algorithm, to the classification of Chinese legal cases. In the proposed LBBT model, we use LDA to extract subject keywords separately for each type of legal document, and then introduce the TextRank algorithm to generate an abstract for each legal document by combining the extracted subject words. BERT is used to vectorize the generated abstracts, which serve as the inputs to a BiLSTM that performs the classification of Chinese legal documents. Experimental results on a data set of 2500 single-charge Chinese legal judgment documents obtained from CAIL2022 show that the proposed LBBT model can effectively remove redundant information from legal documents and improve the ability of LSTM to grasp the global key semantic information of long texts.
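The abstract does not spell out how the extracted subject words are combined with TextRank, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the topic keywords are already available (for example, from an LDA model trained per charge type), that the text is pre-segmented into space-separated tokens, and that keyword hits bias the teleport term of a TextRank-style sentence ranking. The function names, the `1 + hits` prior, and the parameter defaults are all hypothetical.

```python
import math
import re


def split_sentences(text):
    # Split on Chinese and Western sentence-ending punctuation.
    parts = re.findall(r"[^。！？.!?]+[。！？.!?]?", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    # Assumption: text is already word-segmented with spaces.
    return re.findall(r"\w+", sentence.lower())


def similarity(a, b):
    # TextRank-style overlap similarity, normalized by sentence lengths.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    overlap = len(set(a) & set(b))
    return overlap / (math.log(len(a)) + math.log(len(b)))


def summarize(text, topic_keywords, top_k=2, damping=0.85, iterations=50):
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []

    # Weighted sentence graph (no self-loops).
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]

    # Hypothetical topic bias: sentences with more keyword hits receive
    # a larger share of the teleport probability.
    keywords = {k.lower() for k in topic_keywords}
    prior = [1.0 + sum(1 for t in tokens[i] if t in keywords)
             for i in range(n)]
    total = sum(prior)
    prior = [p / total for p in prior]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(weights[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) * prior[i] + damping * rank)
        scores = new_scores

    # Keep the top_k sentences, returned in their original order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = ("the defendant stole a phone from the store. "
           "the weather was sunny that day. "
           "the court found the defendant guilty of theft.")
    print(summarize(doc, ["theft", "defendant"], top_k=2))
```

In a pipeline like the one described, the selected sentences would then be encoded with BERT and passed to the BiLSTM classifier; those stages depend on external models and are omitted here.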
Acknowledgement
This research was funded by the National Funds of Social Science (21BXW076), the National Natural Science Foundation of China (61602518), the Philosophy and Social Science Research Project of Hubei Provincial Department of Education (20G026), the Innovation Research of Young Teachers of Central Universities in 2021 (2722021BZ040), the Key Social Science Projects in Wuhan in 2021 (2021010), Prof. Liu Yaqi’s Outstanding Youth Innovation Team Construction Project (Big Data Intelligent Information Processing and Application Technology Innovation Team), and the School-level Reform Project of Zhongnan University of Economics and Law (YB202158).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Q., Chen, X. (2023). Applying BBLT Incorporating Specific Domain Topic Summary Generation Algorithm to the Classification of Chinese Legal Cases. In: Barolli, L. (eds) Advances in Internet, Data & Web Technologies. EIDWT 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 161. Springer, Cham. https://doi.org/10.1007/978-3-031-26281-4_47
DOI: https://doi.org/10.1007/978-3-031-26281-4_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26280-7
Online ISBN: 978-3-031-26281-4
eBook Packages: Intelligent Technologies and Robotics (R0)