Abstract
With the development of big data and artificial intelligence, computer-assisted judgment of legal cases has become an inevitable trend at the intersection of computer science and law. Judgment prediction methods for legal cases consist of two parts: (1) modeling of legal cases and (2) construction of judgment prediction algorithms. Previous methods are based mainly on feature models and classification algorithms. Traditional feature models require extensive expert knowledge and manual annotation, and they depend heavily on the vocabulary and grammatical information in a database, which hampers the accuracy and generality of subsequent prediction algorithms. In addition, the predictions produced by classification algorithms are coarse-grained and of low accuracy. In general, similar legal cases receive similar judgments. This article proposes a new method for the judgment prediction of legal cases, TenLa, based on a controllable tensor decomposition algorithm and an optimized Lasso regression model. TenLa treats the similarity between legal cases as an important indicator for judgment prediction and consists of three parts: (1) ModTen, a modeling method that represents legal cases as three-dimensional tensors; (2) ConTen, a new tensor decomposition algorithm that decomposes the tensors obtained by ModTen into core tensors through an intermediary tensor, greatly reducing the dimensionality of the original tensors; and (3) OLass, an optimized Lasso regression algorithm trained on the core tensors obtained by ConTen.
Specifically, we optimize OLass with respect to the intermediary tensor in ConTen, so that the core tensors obtained by ConTen carry the tensor elements and structural information most conducive to the accuracy of OLass. Experiments show that TenLa achieves higher accuracy than traditional judgment prediction algorithms.
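The three-stage pipeline described above can be sketched with generic numerical tools. In the sketch below, a plain Tucker-style truncated-SVD compression stands in for ConTen and a minimal coordinate-descent Lasso stands in for OLass; all dimensions, ranks, and the random data are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# ModTen stand-in: each legal case as a 3-D tensor (e.g. circumstances x terms x features)
cases = rng.random((20, 4, 5, 6))            # 20 cases, each a 4x5x6 tensor

def compress(case, ranks=(2, 3, 3)):
    """Tucker-style core tensor via truncated SVD on each mode (a stand-in for ConTen)."""
    core = case
    for mode, r in enumerate(ranks):
        unfold = np.moveaxis(core, mode, 0).reshape(core.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfold, full_matrices=False)
        # contract mode `mode` of the core with the leading r left singular vectors
        core = np.moveaxis(np.tensordot(U[:, :r].T, core, axes=(1, mode)), 0, mode)
    return core

X = np.stack([compress(c).ravel() for c in cases])   # core tensors as feature vectors
y = rng.random(20)                                   # stand-in judgment values

def lasso_cd(X, y, alpha=0.01, iters=200):
    """Minimal coordinate-descent Lasso (a stand-in for OLass)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        for j in range(d):
            r = y - X @ w + X[:, j] * w[j]           # residual excluding feature j
            rho = X[:, j] @ r / n
            z = X[:, j] @ X[:, j] / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / z
    return w

w = lasso_cd(X, y)
print(X.shape, w.shape)      # each 4x5x6 case compressed to a 2x3x3 core
```

The core tensors are much smaller than the originals (18 entries instead of 120 per case), which is the dimensionality reduction the abstract attributes to ConTen.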
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2018YFC0830900 and 2016QY03D0501).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Ethical statement
We confirm that the manuscript has not been submitted to more than one journal for simultaneous consideration and has not been published previously (partly or in full), unless the new work concerns an expansion of previous work.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A Appendix
A.1 Supplementary material 1
Proofs A.1–A.8 prove Lemmas 4.2–4.9, respectively.
Proof A.1
From definition 3.4, the corresponding elements of χ and \({\chi _{({n_0})}}\) are equal. By the definition of the Frobenius norm 3.3, the sums of squares of the elements of χ and \({\chi _{({n_0})}}\) are therefore equal; that is, \(\left \| \chi \right \|_F^2 = \left \| {{\chi _{({n_0})}}} \right \|_F^2\).
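Lemma 4.2 amounts to the fact that unfolding only rearranges entries. A quick NumPy check, assuming the unfolding convention of definition 3.4 (mode n0 indexing the columns):

```python
import numpy as np

rng = np.random.default_rng(0)
chi = rng.random((3, 4, 5))                      # a third-order tensor

# mode-1 unfolding: mode n0 = 1 indexes the columns (assumed Definition 3.4 layout)
chi_unf = np.moveaxis(chi, 1, -1).reshape(-1, chi.shape[1])

# unfolding permutes entries, so the squared Frobenius norms coincide
assert np.isclose(np.linalg.norm(chi)**2, np.linalg.norm(chi_unf)**2)
```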
Proof A.2
Let C = ATA, \(C \in {\mathbb {R}^{J \times J}}\). According to the rule of matrix multiplication, we can obtain that \({C_{ij}} = \sum \limits _{p = 1}^I {A_{ip}^T{A_{pj}}} = \sum \limits _{p = 1}^I {{A_{pi}}{A_{pj}}} \). Based on the definition of the trace norm 3.1, we get that \(Trace(C) = \sum \limits _{j = 1}^J {{C_{jj}}} \). That is \(Trace(C) = \sum \limits _{j = 1}^J {\sum \limits _{p = 1}^I {A_{pj}^2} } \). From the definition of Frobenius norm 3.3, we obtain that \(\left \| A \right \|_F^2 = \sum \limits _{p = 1}^I {\sum \limits _{j = 1}^J {A_{pj}^2} } \). Thus, \(\left \| A \right \|_F^2 = Trace({A^T}A)\).
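Lemma 4.3 is the standard identity \(\left \| A \right \|_F^2 = Trace({A^T}A)\); in NumPy:

```python
import numpy as np

A = np.random.default_rng(1).random((5, 3))
assert np.isclose(np.linalg.norm(A, 'fro')**2, np.trace(A.T @ A))
```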
Proof A.3
Let \(\lambda = \chi { \times _{{n_0}}}P\), \(\lambda \in {\mathbb {R}^{{I_1} \times {I_2} \times {\cdots } \times {I_{{n_0} - 1}} \times {J_{{n_0}}} \times {I_{{n_0} + 1}} \times {\cdots } \times {I_N}}}\). From definition 3.6, we get that \({\lambda _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{j_{{n_0}}}{i_{{n_0} + 1}} {\cdots } {i_N}}} = \sum \limits _{i = 1}^{{I_{{n_0}}}} {{\chi _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}i{i_{{n_0} + 1}} {\cdots } {i_N}}}{P_{i{j_{{n_0}}}}}} \). According to definition 3.4, we obtain that \({\lambda _{({n_0})}} \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {J_{{n_0}}}}}\) and \({\lambda _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}} = {\lambda _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{j_{{n_0}}}{i_{{n_0} + 1}} {\cdots } {i_N}}}\). Since \({\chi _{({n_0})}} \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {I_{{n_0}}}}}\), let \(A = {\chi _{({n_0})}} \times P\), \(A \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {J_{{n_0}}}}}\). From the rule of matrix multiplication, \({A_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}}} = \sum \limits _{i = 1}^{{I_{{n_0}}}} {{\chi _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N})i}{P_{i{j_{{n_0}}}}}} \). Since \({\chi _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N})i} = {\chi _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}i{i_{{n_0} + 1}} {\cdots } {i_N}}}\), we have \({\lambda _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}} = {A_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}}}\), so \({\lambda _{({n_0})}} = A\). That is, \({(\chi { \times _{{n_0}}}P)_{({n_0})}} = {\chi _{({n_0})}} \times P\).
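Lemma 4.4 can be verified numerically. The sketch below assumes the conventions of definitions 3.4 and 3.6: the n0-mode product contracts mode n0 of χ with the rows of P, and the unfolding places mode n0 along the columns.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((3, 4, 5))        # chi
P = rng.random((4, 6))           # P, multiplying mode n0 = 1 (I_{n0} = 4, J_{n0} = 6)

# n0-mode product (Definition 3.6): contract mode n0 of chi with the rows of P
lam = np.einsum('iaj,ab->ibj', X, P)          # shape (3, 6, 5)

# unfolding with mode n0 as the column index (assumed Definition 3.4 layout)
unfold = lambda T, n: np.moveaxis(T, n, -1).reshape(-1, T.shape[n])

# Lemma 4.4: (chi x_{n0} P)_{(n0)} = chi_{(n0)} P
assert np.allclose(unfold(lam, 1), unfold(X, 1) @ P)
```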
Proof A.4
Let \(C = \upsilon _{({n_0})}^T{\varpi _{({n_0})}}\), \(C \in {\mathbb {R}^{{J_{{n_0}}} \times {I_{{n_0}}}}}\). From the rule of matrix multiplication, \({C_{ij}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\upsilon {{{~}_{({n_0})}^T}_{ip}}{\varpi _{({n_0})}}_{pj}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pi}{\varpi _{({n_0})}}_{pj}} \). Then \({F_1} = C{B_{{n_0}}}\), \({F_1} \in {\mathbb {R}^{{J_{{n_0}}} \times {J_{{n_0}}}}}\). \({F_1}_{ij} = \sum \limits _{q = 1}^{{I_{{n_0}}}} {{C_{iq}}{B_{{n_0}}}_{qj}} = \sum \limits _{q = 1}^{{I_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pi}{\varpi _{({n_0})}}_{pq}{B_{{n_0}}}_{qj}} } \). Based on the definition of the trace norm 3.1, we get that \(Trace({F_1}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_1}_{jj}} \). That is \(Trace({F_1}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{q = 1}^{{I_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pj}{\varpi _{({n_0})}}_{pq}{B_{{n_0}}}_{qj}} } } \). When q = m, j = n, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pn}{\varpi _{({n_0})}}_{pm}{B_{{n_0}}}_{mn}} )}}{{\partial {B_{{n_0}}}_{mn}}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pn}{\varpi _{({n_0})}}_{pm}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\varpi {{{~}_{({n_0})}^T}_{mp}}} {\upsilon _{({n_0})}}_{pn}\). 
When q≠m,j≠n, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{\upsilon _{({n_{0}})}}_{pj}{\varpi _{({n_{0}})}}_{pq}{B_{{n_{0}}}}_{qj}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{1}})}}{{\partial {B_{{n_{0}}}}}} = \varpi _{({n_{0}})}^{T}{\upsilon _{({n_{0}})}}\).
Proof A.5
Let \(C = B_{{n_{0}}}^{T}\varpi _{({n_{0}})}^{T}\), \(C \in {\mathbb {R}^{{J_{{n_{0}}}} \times ({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}})}}\). \({C_{ij}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}\varpi {{{~}_{({n_{0}})}^{T}}_{pj}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {{B_{{n_{0}}}}_{pi}\varpi {{{~}_{({n_{0}})}^{T}}_{pj}}} \). \({F_{2}} = C{\upsilon _{({n_{0}})}}\), \({F_{2}} \in {\mathbb {R}^{{J_{{n_{0}}}} \times {J_{{n_{0}}}}}}\). \({F_{2}}_{ij} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{C_{iq}}{\upsilon _{({n_{0}})}}_{qj}} \). That is \({F_{2}}_{ij} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}\varpi {{{~}_{({n_{0}})}^{T}}_{pq}}{\upsilon _{({n_{0}})}}_{qj}} } \). Based on definition 3.1, \(Trace({F_{2}}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_2}_{jj}} \). \(Trace({F_2}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{I_{{n_0}}}} {{B_{{n_0}}}_{pj}\varpi {{{~}_{({n_0})}^T}_{pq}}{\upsilon _{({n_0})}}_{qj}} } } \). When p = m, j = n, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{B_{{n_0}}}_{mn}\varpi {{{~}_{({n_0})}^T}_{mq}}{\upsilon _{({n_0})}}_{qn}} )}}{{\partial {B_{{n_0}}}_{mn}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\varpi {{{~}_{({n_0})}^T}_{mq}}{\upsilon _{({n_0})}}_{qn}} \). When p≠m, j≠n, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{B_{{n_{0}}}}_{pj}\varpi {{{~}_{({n_{0}})}^{T}}_{pq}}{\upsilon _{({n_{0}})}}_{qj}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). 
Thus, \(\frac {{\partial Trace({F_{2}})}}{{\partial {B_{{n_{0}}}}}} = \varpi _{({n_{0}})}^{T}{\upsilon _{({n_{0}})}}\).
Proof A.6
Let \(C = \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}\), \(C \in {\mathbb {R}^{{I_{{n_{0}}}} \times {I_{{n_{0}}}}}}\). Then \({F_{3}} = B_{{n_{0}}}^{T}C{B_{{n_{0}}}}\). Let \(D = B_{{n_{0}}}^{T}C\), \(D \in {\mathbb {R}^{{J_{{n_{0}}}} \times {I_{{n_{0}}}}}}\). \({F_{3}} = D{B_{{n_{0}}}}\). From the rule of matrix multiplication, \({D_{ij}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}{C_{pj}}} \). \({F_{3}}_{ij} = \sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{D_{iq}}{B_{{n_{0}}}}_{qj}} \). Based on the definition of the trace norm 3.1, \(Trace({F_{3}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{3}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{D_{jq}}{B_{{n_{0}}}}_{qj}} } \). That is \(Trace({F_{3}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{jp}}{C_{pq}}{B_{{n_{0}}}}_{qj}} } } \). When q = m, j = n, \(\frac {{\partial (\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{np}}{C_{pm}}} {B_{{n_{0}}}}_{mn})}}{{\partial {B_{{n_{0}}}}_{mn}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{np}}{C_{pm}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {C_{mp}^{T}{B_{{n_{0}}}}_{pn}} \). When p = m, j = n, \(\frac {{\partial (\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{nm}}{C_{mq}}{B_{{n_{0}}}}_{qn}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = \sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{C_{mq}}{B_{{n_{0}}}}_{qn}} \). In other cases, \(\frac {{\partial (B{{{~}_{{n_{0}}}^{T}}_{jp}}{C_{pq}}{B_{{n_{0}}}}_{qj})}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = C{B_{{n_{0}}}} + {C^{T}}{B_{{n_{0}}}}\). Since \(C = \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}\), CT = C. \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = 2C{B_{{n_{0}}}}\). 
Thus, \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = \eta \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}{B_{{n_{0}}}}\), η = 2.
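The closed-form gradient of lemma 4.7 can be checked against central finite differences; the shapes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.random((7, 4))                 # plays the role of varpi_{(n0)}
C = W.T @ W                            # symmetric, as in the proof
B = rng.random((4, 3))                 # plays the role of B_{n0}

f = lambda B: np.trace(B.T @ C @ B)

# central finite differences vs the closed form 2 C B (Lemma 4.7, eta = 2)
g = np.zeros_like(B)
eps = 1e-6
for m in range(B.shape[0]):
    for n in range(B.shape[1]):
        E = np.zeros_like(B); E[m, n] = eps
        g[m, n] = (f(B + E) - f(B - E)) / (2 * eps)

assert np.allclose(g, 2 * C @ B, atol=1e-4)
```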
Proof A.7
Let \(C = \varpi _{({n_{0}})}^{T}\), \(C \in {\mathbb {R}^{{I_{{n_{0}}}} \times ({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}})}}\). From the singular value decomposition, we get that C = PSQT, where P and Q are orthogonal matrices and S is a diagonal matrix. Since S has no more rows than columns and its diagonal entries are nonzero (i.e., C has full row rank), there exists a matrix U that satisfies SU = E. Let A = QUPT; then CA = PSQTQUPT. Since P and Q are orthogonal matrices, QTQ and PPT are identity matrices. Since SU = E, CA = E, that is, \(\varpi _{({n_{0}})}^{T}A = E\). Thus, there exists a matrix A, \(A \in {\mathbb {R}^{({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}) \times {I_{{n_{0}}}}}}\), that satisfies \(\varpi _{({n_{0}})}^{T}A = E\).
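The construction in proof A.7 can be reproduced directly with NumPy's SVD; the dimensions below are illustrative, and the random wide matrix has full row rank almost surely.

```python
import numpy as np

rng = np.random.default_rng(4)
C = rng.random((3, 10))                 # wide: varpi_{(n0)}^T with I_{n0} = 3

P, s, Qt = np.linalg.svd(C, full_matrices=True)    # C = P S Q^T
U = np.zeros((10, 3))
U[:3, :3] = np.diag(1.0 / s)            # S U = E (singular values s > 0 here)
A = Qt.T @ U @ P.T                      # the matrix A = Q U P^T from the proof

assert np.allclose(C @ A, np.eye(3))    # varpi_{(n0)}^T A = E
```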
Proof A.8
According to lemma 4.8, there exists a matrix A, \(A \in {\mathbb {R}^{({J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}) \times {I_{{n_0}}}}}\), that satisfies \(\varpi _{({n_0})}^TA = E\), where E is the identity matrix, \(E \in {\mathbb {R}^{{I_{{n_0}}} \times {I_{{n_0}}}}}\). From proof A.7, we get that A = QUPT, where PSQT is the singular value decomposition of \(\varpi _{({n_0})}^T\) and U can be calculated from SU = E. Since \({\varpi _{({n_0})}}{B_{{n_0}}} = {\upsilon _{({n_0})}}\), \({A^T}{\varpi _{({n_0})}}{B_{{n_0}}} = {A^T}{\upsilon _{({n_0})}}\). Since \(\varpi _{({n_0})}^TA = {A^T}{\varpi _{({n_0})}} = E\), \({B_{{n_0}}} = {A^T}{\upsilon _{({n_0})}}\). That is, \({B_{{n_0}}} = P{U^T}{Q^T}{\upsilon _{({n_0})}}\).
A.2 Supplementary material 2
Proofs A.9–A.11 prove Lemmas 4.11–4.13, respectively.
Proof A.9
According to the rule of matrix multiplication, \({F_4}_{ij} = \sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {P{{{~}_{({n_{0}})}^{T}}_{ip}}} {\widehat \chi _{({n_{0}})pj}}\). From the definition of the trace norm 3.1, \(Trace({F_{4}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{4}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{P_{({n_{0}})}}_{pj}{{\widehat \chi }_{({n_{0}})pj}}} } \). When p = m, j = n, \(\frac {{\partial ({P_{({n_{0}})}}_{mn}{{\widehat \chi }_{({n_{0}})mn}})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = {P_{({n_{0}})}}_{mn}\). When p≠m, j≠n, \(\frac {{\partial ({P_{({n_{0}})}}_{pj}{{\widehat \chi }_{({n_{0}})pj}})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{4}})}}{{\partial {{\widehat \chi }_{({n_{0}})}}}} = {P_{({n_{0}})}}\).
Proof A.10
According to the rule of matrix multiplication, \({F_{5}}_{ij} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\widehat \chi _{({n_0})ip}^T{P_{({n_0})}}_{pj}} \). From the definition of the trace norm 3.1, \(Trace({F_5}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_5}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{{\widehat \chi }_{({n_0})pj}}{P_{({n_0})}}_{pj}} } \). When p = m, j = n, \(\frac {{\partial ({{\widehat \chi }_{({n_0})mn}}{P_{({n_0})}}_{mn})}}{{\partial {{\widehat \chi }_{({n_0})mn}}}} = {P_{({n_0})}}_{mn}\). When p≠m, j≠n, \(\frac {{\partial ({{\widehat \chi }_{({n_0})pj}}{P_{({n_0})}}_{pj})}}{{\partial {{\widehat \chi }_{({n_0})mn}}}} = 0\). Thus, \(\frac {{\partial Trace({F_5})}}{{\partial {{\widehat \chi }_{({n_0})}}}} = {P_{({n_0})}}\).
Proof A.11
According to the rule of matrix multiplication, \({F_6}_{ij} = \sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\widehat \chi _{({n_{0}})ip}^{T}{{\widehat \chi }_{({n_{0}})pj}}} \). From the definition of the trace norm 3.1, \(Trace({F_{6}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{6}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\widehat \chi _{({n_{0}})pj}^{2}} } \). When p = m, j = n, \(\frac {{\partial (\widehat \chi _{({n_{0}})mn}^{2})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 2{\widehat \chi _{({n_{0}})mn}}\). When p≠m, j≠n, \(\frac {{\partial (\widehat \chi _{({n_{0}})pj}^{2})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{6}})}}{{\partial {{\widehat \chi }_{({n_{0}})}}}} = \eta {\widehat \chi _{({n_{0}})}}\), η = 2.
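The three trace derivatives of lemmas 4.11–4.13 can be checked in one pass against central finite differences; shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
P = rng.random((6, 4))                  # plays the role of P_{(n0)}
X = rng.random((6, 4))                  # plays the role of hat-chi_{(n0)}
eps = 1e-6

def num_grad(f, X):
    """Central finite-difference gradient of scalar f at X."""
    g = np.zeros_like(X)
    for m in range(X.shape[0]):
        for n in range(X.shape[1]):
            E = np.zeros_like(X); E[m, n] = eps
            g[m, n] = (f(X + E) - f(X - E)) / (2 * eps)
    return g

assert np.allclose(num_grad(lambda X: np.trace(P.T @ X), X), P, atol=1e-5)      # Lemma 4.11
assert np.allclose(num_grad(lambda X: np.trace(X.T @ P), X), P, atol=1e-5)      # Lemma 4.12
assert np.allclose(num_grad(lambda X: np.trace(X.T @ X), X), 2 * X, atol=1e-5)  # Lemma 4.13
```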
A.3 Supplementary material 3
Proofs A.12–A.15 prove Lemmas 4.15–4.18, respectively.
Proof A.12
Let D = AB, \(D \in {\mathbb {R}^{m \times p}}\). According to the rule of matrix multiplication, \({D_{ij}} = \sum \limits _{a = 1}^{n} {{A_{ia}}{B_{aj}}} \). Let G = DC, \(G \in {\mathbb {R}^{m \times m}}\). Then \({G_{ij}} = \sum \limits _{b = 1}^{p} {{D_{ib}}{C_{bj}}} = \sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {{A_{ia}}{B_{ab}}{C_{bj}}} } \). From the definition of the trace norm 3.1, \(Trace(G) = \sum \limits _{h = 1}^{m} {{G_{hh}}} = \sum \limits _{h = 1}^{m} {\sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {{A_{ha}}{B_{ab}}{C_{bh}}} } } \).
Let D1 = CA, \({D_{1}} \in {\mathbb {R}^{p \times n}}\). According to the rule of matrix multiplication, \({D_{1ij}} = \sum \limits _{h = 1}^{m} {{C_{ih}}{A_{hj}}} \). Let G1 = D1B, \({G_{1}} \in {\mathbb {R}^{p \times p}}\). \({G_{1ij}} = \sum \limits _{a = 1}^{n} {{D_{1ia}}{B_{aj}}} = \sum \limits _{a = 1}^{n} {\sum \limits _{h = 1}^{m} {{C_{ih}}{A_{ha}}{B_{aj}}} } \). From the definition of the trace norm 3.1, \(Trace({G_{1}}) = \sum \limits _{b = 1}^{p} {{G_{1bb}}} = \sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {\sum \limits _{h = 1}^{m} {{C_{bh}}{A_{ha}}{B_{ab}}} } } \). Thus, Trace(ABC) = Trace(CAB).
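The cyclic-trace identity of lemma 4.15 is easy to confirm numerically for conformable rectangular matrices:

```python
import numpy as np

rng = np.random.default_rng(6)
A, B, C = rng.random((3, 4)), rng.random((4, 5)), rng.random((5, 3))

# Lemma 4.15: the trace is invariant under cyclic permutation of the factors
assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))
```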
Proof A.13
According to lemma 4.15, we can obtain that \(Trace(\varphi {({\widehat \chi ^{(m)}})^{T}}\widehat \chi _{vec}^{(m)T}\mu ) = Trace(\widehat \chi _{vec}^{(m)T}\mu \varphi {({\widehat \chi ^{(m)}})^T})\). Let \(A = \mu \varphi {({\widehat \chi ^{(m)}})^T}\), \(A \in {\mathbb {R}^{{J_1}{J_2} {\cdots } {J_N}}}\). According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^1 {{\mu _{ip}}\varphi ({{\widehat \chi }^{(m)}})_{pj}^T} \). Let \(D = \widehat \chi _{vec}^{(m)T}A\), \({D_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{veciq}^{(m)T}{A_{qj}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^1 {\widehat \chi _{veciq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pj}^T} } \). From the definition of the trace norm 3.1, \(Trace(D) = \sum \limits _{k = 1}^1 {{D_{kk}}} = \sum \limits _{k = 1}^1 {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^1 {\widehat \chi _{veckq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pk}^T} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^1 {\widehat \chi _{vecst}^{(m)}{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^T} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^{1} {{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^{T}} \). When q≠s, k≠t, \(\frac {{\partial (\sum \limits _{p = 1}^{1} {\widehat \chi _{veckq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pk}^{T}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Therefore, \(\frac {{\partial {f_{1}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \mu \varphi {({\widehat \chi ^{(m)}})^{T}}\).
Proof A.14
According to lemma 4.15, we can obtain that \(Trace({\mu ^{T}}\widehat \chi _{vec}^{(m)}\varphi ({\widehat \chi ^{(m)}})) = Trace(\varphi ({\widehat \chi ^{(m)}}){\mu ^T}\widehat \chi _{vec}^{(m)})\). Let \(A = \varphi ({\widehat \chi ^{(m)}}){\mu ^T}\), According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^1 {\varphi {{({{\widehat \chi }^{(m)}})}_{ip}}\mu _{pj}^T} \). Let \(D = A\widehat \chi _{vec}^{(m)}\), \({D_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{A_{iq}}\widehat \chi _{vecqj}^{(m)}} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{ip}}\mu _{pq}^{T}\widehat \chi _{vecqj}^{(m)}} } \). From the definition of the trace norm 3.1, \(Trace(D) = \sum \limits _{k = 1}^{1} {{D_{kk}}} = \sum \limits _{k = 1}^{1} {\sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{kp}}\mu _{pq}^{T}\widehat \chi _{vecqk}^{(m)}} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{tp}}\mu _{ps}^{T}\widehat \chi _{vecst}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^1 {{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^T} \). When q≠s, k≠t, \(\frac {{\partial (\sum \limits _{p = 1}^1 {\varphi {{({{\widehat \chi }^{(m)}})}_{kp}}\mu _{pq}^T\widehat \chi _{vecqk}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Thus, \(\frac {{\partial {f_2}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \mu \varphi {({\widehat \chi ^{(m)}})^T}\).
Proof A.15
Based on lemma 4.15, we get that \(Trace({\mu ^T}\widehat \chi _{vec}^{(m)}\widehat \chi _{vec}^{(m)T}\mu ) = Trace(\widehat \chi _{vec}^{(m)T}\mu {\mu ^T}\widehat \chi _{vec}^{(m)})\). Let D = μμT, \(A = \widehat \chi _{vec}^{(m)T}D\). According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecip}^{(m)T}{D_{pj}}} \). Let \(H = A\widehat \chi _{vec}^{(m)}\), \({H_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{A_{iq}}\widehat \chi _{vecqj}^{(m)}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecip}^{(m)T}{D_{pq}}\widehat \chi _{vecqj}^{(m)}} } \). From the definition of the trace norm 3.1, \(Trace(H) = \sum \limits _{k = 1}^1 {{H_{kk}}} = \sum \limits _{k = 1}^1 {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{veckp}^{(m)T}{D_{pq}}\widehat \chi _{vecqk}^{(m)}} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vectp}^{(m)T}{D_{ps}}\widehat \chi _{vecst}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {D_{sp}^T\widehat \chi _{vecpt}^{(m)}} \). When p = s, k = t, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecst}^{(m)}{D_{sq}}\widehat \chi _{vecqt}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{D_{sq}}\widehat \chi _{vecqt}^{(m)}} \). In other cases, \(\frac {{\partial (\widehat \chi _{veckp}^{(m)T}{D_{pq}}\widehat \chi _{vecqk}^{(m)})}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Thus, \(\frac {{\partial {f_{3}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = D\widehat \chi _{vec}^{(m)} + {D^{T}}\widehat \chi _{vec}^{(m)}\). Since D = DT, \(\frac {{\partial {f_{3}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \eta \mu {\mu ^{T}}\widehat \chi _{vec}^{(m)}\), where η is a constant, η = 2.
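The gradient derived in proof A.15 can also be checked against central finite differences, treating \(\widehat \chi _{vec}^{(m)}\) as a column vector (the dimension below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
d = 8
mu = rng.random((d, 1))                 # plays the role of mu
x = rng.random((d, 1))                  # plays the role of hat-chi_vec^{(m)}

f = lambda x: np.trace(mu.T @ x @ x.T @ mu)

# central finite differences vs the closed form 2 mu mu^T x (eta = 2)
eps = 1e-6
g = np.zeros_like(x)
for m in range(d):
    E = np.zeros_like(x); E[m, 0] = eps
    g[m, 0] = (f(x + E) - f(x - E)) / (2 * eps)

assert np.allclose(g, 2 * mu @ mu.T @ x, atol=1e-4)
```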
About this article
Cite this article
Guo, X., Zhang, H., Ye, L. et al. TenLa: an approach based on controllable tensor decomposition and optimized lasso regression for judgement prediction of legal cases. Appl Intell 51, 2233–2252 (2021). https://doi.org/10.1007/s10489-020-01912-z