Applying BBLT Incorporating Specific Domain Topic Summary Generation Algorithm to the Classification of Chinese Legal Cases

Zhang, Qiong; Chen, Xu

doi:10.1007/978-3-031-26281-4_47

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 161)

Included in the following conference series:

International Conference on Emerging Internetworking, Data & Web Technologies

Abstract

Most existing case retrieval platforms cannot effectively extract feature information from Chinese legal cases, and therefore perform unsatisfactorily on indicators such as the relevance and accuracy of retrieval results. In response to this challenge, we propose applying the LBBT model, which incorporates a domain-specific, topic-based text summary generation algorithm, to the classification of Chinese legal cases. In the proposed LBBT model, we use LDA to extract subject keywords for each type of legal document separately, and then introduce the TextRank algorithm to generate an abstract for each legal document by combining the extracted subject words. BERT is used to vectorize the generated abstracts, which serve as the inputs of a BiLSTM that performs the classification of Chinese legal documents. Experimental results on a data set of 2500 single-charge Chinese legal judgment documents obtained from CAIL2022 show that the proposed LBBT model can effectively remove redundant information from legal documents and improve the ability of LSTM to grasp the global key semantic information of long texts.
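The summary-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the LDA stage is replaced by a hand-picked keyword list, the BERT and BiLSTM stages are omitted, and the text is a toy English example (Chinese input would first need a word segmenter). The function names `summarize`, `tokenize`, and `similarity`, and the choice to inject topic words as a teleport bias in the TextRank iteration, are assumptions made for illustration only.

```python
import math
import re


def split_sentences(text):
    # Split on Chinese and Western sentence-ending punctuation.
    parts = re.findall(r"[^。！？.!?]+[。！？.!?]?", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    # Toy tokenizer for space-delimited text; Chinese needs a real segmenter.
    return re.findall(r"\w+", sentence.lower())


def similarity(a, b):
    # TextRank sentence similarity: shared words normalised by log lengths.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(set(a) & set(b)) / (math.log(len(a)) + math.log(len(b)))


def summarize(text, topic_words, k=2, damping=0.85, iters=50):
    """Return the k highest-ranked sentences, in their original order."""
    sents = split_sentences(text)
    n = len(sents)
    if n <= k:
        return sents
    toks = [tokenize(s) for s in sents]
    topic = {w.lower() for w in topic_words}

    # Assumption: topic keywords bias the teleport distribution, so that
    # sentences mentioning them receive extra rank on every iteration.
    bias = [1.0 + sum(1 for t in tk if t in topic) for tk in toks]
    total = sum(bias)
    bias = [b / total for b in bias]

    # Weighted, undirected sentence graph.
    w = [[similarity(toks[i], toks[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]

    # Power iteration of the (personalised) PageRank update.
    score = [1.0 / n] * n
    for _ in range(iters):
        score = [
            (1 - damping) * bias[i]
            + damping * sum(score[j] * w[j][i] / out[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]

    top = sorted(sorted(range(n), key=lambda i: score[i], reverse=True)[:k])
    return [sents[i] for i in top]


# Hypothetical inputs: in the paper's pipeline the keywords would come from LDA.
DOC = ("The defendant stole a phone from the victim. "
       "The weather was sunny that day. "
       "The court found the defendant guilty of theft. "
       "The clerk recorded the hearing date.")
TOPIC_WORDS = ["defendant", "stole", "guilty", "theft"]

summary = summarize(DOC, TOPIC_WORDS, k=2)
print(summary)
```

In a fuller pipeline, the sentences returned here would be concatenated into a shortened document and passed to a BERT encoder, whose token vectors would feed the BiLSTM classifier.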

Acknowledgement

This research was funded by National Funds of Social Science (21BXW076), National Natural Science Foundation of China (61602518), Philosophy and Social Science Research Project of Hubei Provincial Department of Education (20G026), Innovation Research of Young Teachers of Central Universities in 2021 (2722021BZ040), The Key Social Science Projects in Wuhan in 2021 (2021010), Prof. Liu Yaqi’s Outstanding Youth Innovation team Construction Project (Big Data Intelligent Information Processing and Application Technology Innovation Team) and School-level reform project of Zhongnan University of Economics and Law (YB202158).

Author information

Authors and Affiliations

School of Management Information and System, Zhongnan University of Economics and Law, Wuhan, 430073, China
Qiong Zhang
School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan, 430073, China
Xu Chen

Corresponding author

Correspondence to Xu Chen.

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli

About this paper

Cite this paper

Zhang, Q., Chen, X. (2023). Applying BBLT Incorporating Specific Domain Topic Summary Generation Algorithm to the Classification of Chinese Legal Cases. In: Barolli, L. (eds) Advances in Internet, Data & Web Technologies. EIDWT 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 161. Springer, Cham. https://doi.org/10.1007/978-3-031-26281-4_47

DOI: https://doi.org/10.1007/978-3-031-26281-4_47
Published: 12 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26280-7
Online ISBN: 978-3-031-26281-4
eBook Packages: Intelligent Technologies and Robotics (R0)
