Applying BBLT Incorporating Specific Domain Topic Summary Generation Algorithm to the Classification of Chinese Legal Cases

  • Conference paper
Advances in Internet, Data & Web Technologies (EIDWT 2023)

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 161)

Abstract

Most existing case retrieval platforms cannot effectively extract feature information from Chinese legal cases and therefore perform unsatisfactorily on indicators such as the relevance and accuracy of retrieval results. In response, we propose applying the LBBT model, which incorporates a domain-specific, topic-based text summary generation algorithm, to the classification of Chinese legal cases. In the proposed LBBT model, we use LDA to extract subject keywords separately for each type of legal document, and then introduce the TextRank algorithm to generate a summary of each legal document by combining the extracted subject keywords. BERT vectorizes the generated summaries, and the resulting vectors serve as the input to a BiLSTM that classifies Chinese legal documents. Experimental results on a data set of 2500 single-charge Chinese legal judgment documents obtained from CAIL2022 show that the proposed LBBT model can effectively remove redundant information from legal documents and improve the ability of LSTM to grasp the global key semantic information of long texts.
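The summary-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the topic keywords are already available (here a hand-written list standing in for LDA output), uses whitespace-style tokenization instead of a Chinese word segmenter, and biases a standard TextRank sentence ranking toward sentences that contain topic keywords. The function names `tokenize`, `similarity`, and `topic_textrank`, the bias formula, and the example sentences are all hypothetical choices made for illustration.

```python
import math
import re


def tokenize(sentence):
    """Lowercase word tokens; a real Chinese pipeline would use a word segmenter."""
    return re.findall(r"\w+", sentence.lower())


def similarity(a, b):
    """TextRank-style sentence similarity: shared words, length-normalized."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids log(1) + log(1) = 0 in the denominator
    shared = len(set(a) & set(b))
    return shared / (math.log(len(a)) + math.log(len(b)))


def topic_textrank(sentences, topic_words, top_k=2, damping=0.85, iters=50):
    """Rank sentences with PageRank whose restart is biased toward topic words.

    The bias (1 + number of topic-word hits) is an assumption of this sketch,
    not a formula taken from the paper.
    """
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    # Symmetric edge weights between every pair of distinct sentences.
    w = [[similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out_sum = [sum(row) for row in w]
    # Restart distribution, normalized to sum to 1.
    topic = set(topic_words)
    bias = [1.0 + sum(t in topic for t in toks) for toks in tokens]
    total = sum(bias)
    bias = [b / total for b in bias]
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) * bias[i]
            + damping * sum(w[j][i] / out_sum[j] * scores[j]
                            for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]  # keep document order


# Hypothetical toy document and topic keywords (stand-in for LDA output).
sentences = [
    "The defendant stole a mobile phone from the victim.",
    "The weather was cloudy on that day.",
    "The court found the defendant guilty of theft of the phone.",
    "The hearing was held in public.",
]
topic_words = ["defendant", "theft", "stole", "phone"]
summary = topic_textrank(sentences, topic_words, top_k=2)
print(summary)
```

In a fuller pipeline along the lines of the abstract, the selected sentences would then be encoded by BERT and passed to a BiLSTM classifier; those stages are omitted here to keep the sketch dependency-free.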

Acknowledgement

This research was funded by the National Funds of Social Science (21BXW076), the National Natural Science Foundation of China (61602518), the Philosophy and Social Science Research Project of Hubei Provincial Department of Education (20G026), the Innovation Research of Young Teachers of Central Universities in 2021 (2722021BZ040), the Key Social Science Projects in Wuhan in 2021 (2021010), Prof. Liu Yaqi’s Outstanding Youth Innovation Team Construction Project (Big Data Intelligent Information Processing and Application Technology Innovation Team), and the School-level Reform Project of Zhongnan University of Economics and Law (YB202158).

Corresponding author

Correspondence to Xu Chen.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, Q., Chen, X. (2023). Applying BBLT Incorporating Specific Domain Topic Summary Generation Algorithm to the Classification of Chinese Legal Cases. In: Barolli, L. (eds) Advances in Internet, Data & Web Technologies. EIDWT 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 161. Springer, Cham. https://doi.org/10.1007/978-3-031-26281-4_47
