Abstract
With the development of big data and artificial intelligence, computer-assisted judgment of legal cases has become an inevitable trend at the intersection of computer science and law. Judgment prediction methods for legal cases consist of two parts: (1) modeling of legal cases and (2) construction of judgment prediction algorithms. Previous methods are based mainly on feature models and classification algorithms. Traditional feature models require extensive expert knowledge and manual annotation, and they depend heavily on the vocabulary and grammatical information in a database, which hampers the accuracy and generality of subsequent prediction algorithms. In addition, the predictions produced by classification algorithms are coarse-grained and of low accuracy. In general, similar legal cases receive similar judgments. This article proposes a new method for the judgment prediction of legal cases, TenLa, based on a controllable tensor decomposition algorithm and an optimized Lasso regression model. TenLa treats the similarity between legal cases as an important indicator for judgment prediction and consists of three parts: (1) ModTen, a modeling method that represents legal cases as three-dimensional tensors; (2) ConTen, a new tensor decomposition algorithm that decomposes the tensors obtained by ModTen into core tensors through an intermediary tensor, greatly reducing the dimensionality of the original tensors; and (3) OLass, an optimized Lasso regression algorithm trained on the core tensors obtained by ConTen.
Specifically, we optimize OLass with respect to the intermediary tensor in ConTen, so that the core tensors obtained by ConTen carry the tensor elements and structural information most conducive to the accuracy of OLass. Experiments show that TenLa achieves higher accuracy than traditional judgment prediction algorithms.
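The three-stage pipeline described above can be sketched with generic numerical tools. In the sketch below, a plain Tucker-style truncated-SVD compression stands in for ConTen and a minimal coordinate-descent Lasso stands in for OLass; all dimensions, ranks, and the random data are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# ModTen stand-in: each legal case as a 3-D tensor (e.g. circumstances x terms x features)
cases = rng.random((20, 4, 5, 6))            # 20 cases, each a 4x5x6 tensor

def compress(case, ranks=(2, 3, 3)):
    """Tucker-style core tensor via truncated SVD on each mode (a stand-in for ConTen)."""
    core = case
    for mode, r in enumerate(ranks):
        unfold = np.moveaxis(core, mode, 0).reshape(core.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfold, full_matrices=False)
        # contract mode `mode` of the core with the leading r left singular vectors
        core = np.moveaxis(np.tensordot(U[:, :r].T, core, axes=(1, mode)), 0, mode)
    return core

X = np.stack([compress(c).ravel() for c in cases])   # core tensors as feature vectors
y = rng.random(20)                                   # stand-in judgment values

def lasso_cd(X, y, alpha=0.01, iters=200):
    """Minimal coordinate-descent Lasso (a stand-in for OLass)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        for j in range(d):
            r = y - X @ w + X[:, j] * w[j]           # residual excluding feature j
            rho = X[:, j] @ r / n
            z = X[:, j] @ X[:, j] / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / z
    return w

w = lasso_cd(X, y)
print(X.shape, w.shape)      # each 4x5x6 case compressed to a 2x3x3 core
```

The core tensors are much smaller than the originals (18 entries instead of 120 per case), which is the dimensionality reduction the abstract attributes to ConTen.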
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2018YFC0830900 and 2016QY03D0501).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Ethical statement
We confirm that the manuscript has not been submitted to more than one journal for simultaneous consideration and has not been published previously (partly or in full), unless the new work concerns an expansion of previous work.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A Appendix
A.1 Supplementary material 1
Proofs A.1–A.8 prove Lemmas 4.2–4.9, respectively.
Proof A.1
From definition 3.4, the corresponding elements of χ and \({\chi _{({n_0})}}\) are equal. By the definition of the Frobenius norm 3.3, the sums of squares of the elements of χ and \({\chi _{({n_0})}}\) are therefore equal; that is, \(\left \| \chi \right \|_F^2 = \left \| {{\chi _{({n_0})}}} \right \|_F^2\).
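Lemma 4.2 amounts to the fact that unfolding only rearranges entries. A quick NumPy check, assuming the unfolding convention of definition 3.4 (mode n0 indexing the columns):

```python
import numpy as np

rng = np.random.default_rng(0)
chi = rng.random((3, 4, 5))                      # a third-order tensor

# mode-1 unfolding: mode n0 = 1 indexes the columns (assumed Definition 3.4 layout)
chi_unf = np.moveaxis(chi, 1, -1).reshape(-1, chi.shape[1])

# unfolding permutes entries, so the squared Frobenius norms coincide
assert np.isclose(np.linalg.norm(chi)**2, np.linalg.norm(chi_unf)**2)
```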
Proof A.2
Let C = ATA, \(C \in {\mathbb {R}^{J \times J}}\). According to the rule of matrix multiplication, we can obtain that \({C_{ij}} = \sum \limits _{p = 1}^I {A_{ip}^T{A_{pj}}} = \sum \limits _{p = 1}^I {{A_{pi}}{A_{pj}}} \). Based on the definition of the trace norm 3.1, we get that \(Trace(C) = \sum \limits _{j = 1}^J {{C_{jj}}} \). That is \(Trace(C) = \sum \limits _{j = 1}^J {\sum \limits _{p = 1}^I {A_{pj}^2} } \). From the definition of Frobenius norm 3.3, we obtain that \(\left \| A \right \|_F^2 = \sum \limits _{p = 1}^I {\sum \limits _{j = 1}^J {A_{pj}^2} } \). Thus, \(\left \| A \right \|_F^2 = Trace({A^T}A)\).
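Lemma 4.3 is the standard identity \(\left \| A \right \|_F^2 = Trace({A^T}A)\); in NumPy:

```python
import numpy as np

A = np.random.default_rng(1).random((5, 3))
assert np.isclose(np.linalg.norm(A, 'fro')**2, np.trace(A.T @ A))
```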
Proof A.3
Let \(\lambda = \chi { \times _{{n_0}}}P\), \(\lambda \in {\mathbb {R}^{{I_1} \times {I_2} \times {\cdots } \times {I_{{n_0} - 1}} \times {J_{{n_0}}} \times {I_{{n_0} + 1}} \times {\cdots } \times {I_N}}}\). From definition 3.6, we get that \({\lambda _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{j_{{n_0}}}{i_{{n_0} + 1}} {\cdots } {i_N}}} = \sum \limits _{i = 1}^{{I_{{n_0}}}} {{\chi _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}i{i_{{n_0} + 1}} {\cdots } {i_N}}}{P_{i{j_{{n_0}}}}}} \). According to definition 3.4, we obtain that \({\lambda _{({n_0})}} \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {J_{{n_0}}}}}\) and \({\lambda _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}} = {\lambda _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{j_{{n_0}}}{i_{{n_0} + 1}} {\cdots } {i_N}}}\). Since \({\chi _{({n_0})}} \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {I_{{n_0}}}}}\), let \(A = {\chi _{({n_0})}} \times P\), \(A \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {J_{{n_0}}}}}\). From the rule of matrix multiplication, \({A_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}}} = \sum \limits _{i = 1}^{{I_{{n_0}}}} {{\chi _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N})i}{P_{i{j_{{n_0}}}}}} \). Since \({\chi _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N})i} = {\chi _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}i{i_{{n_0} + 1}} {\cdots } {i_N}}}\), we have \({\lambda _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}} = {A_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}}}\), so \({\lambda _{({n_0})}} = A\). That is, \({(\chi { \times _{{n_0}}}P)_{({n_0})}} = {\chi _{({n_0})}} \times P\).
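Lemma 4.4 can be verified numerically. The sketch below assumes the conventions of definitions 3.4 and 3.6: the n0-mode product contracts mode n0 of χ with the rows of P, and the unfolding places mode n0 along the columns.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((3, 4, 5))        # chi
P = rng.random((4, 6))           # P, multiplying mode n0 = 1 (I_{n0} = 4, J_{n0} = 6)

# n0-mode product (Definition 3.6): contract mode n0 of chi with the rows of P
lam = np.einsum('iaj,ab->ibj', X, P)          # shape (3, 6, 5)

# unfolding with mode n0 as the column index (assumed Definition 3.4 layout)
unfold = lambda T, n: np.moveaxis(T, n, -1).reshape(-1, T.shape[n])

# Lemma 4.4: (chi x_{n0} P)_{(n0)} = chi_{(n0)} P
assert np.allclose(unfold(lam, 1), unfold(X, 1) @ P)
```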
Proof A.4
Let \(C = \upsilon _{({n_0})}^T{\varpi _{({n_0})}}\), \(C \in {\mathbb {R}^{{J_{{n_0}}} \times {I_{{n_0}}}}}\). From the rule of matrix multiplication, \({C_{ij}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\upsilon {{{~}_{({n_0})}^T}_{ip}}{\varpi _{({n_0})}}_{pj}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pi}{\varpi _{({n_0})}}_{pj}} \). Then \({F_1} = C{B_{{n_0}}}\), \({F_1} \in {\mathbb {R}^{{J_{{n_0}}} \times {J_{{n_0}}}}}\). \({F_1}_{ij} = \sum \limits _{q = 1}^{{I_{{n_0}}}} {{C_{iq}}{B_{{n_0}}}_{qj}} = \sum \limits _{q = 1}^{{I_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pi}{\varpi _{({n_0})}}_{pq}{B_{{n_0}}}_{qj}} } \). Based on the definition of the trace norm 3.1, we get that \(Trace({F_1}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_1}_{jj}} \). That is \(Trace({F_1}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{q = 1}^{{I_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pj}{\varpi _{({n_0})}}_{pq}{B_{{n_0}}}_{qj}} } } \). When q = m, j = n, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pn}{\varpi _{({n_0})}}_{pm}{B_{{n_0}}}_{mn}} )}}{{\partial {B_{{n_0}}}_{mn}}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pn}{\varpi _{({n_0})}}_{pm}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\varpi {{{~}_{({n_0})}^T}_{mp}}} {\upsilon _{({n_0})}}_{pn}\). 
When q≠m,j≠n, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{\upsilon _{({n_{0}})}}_{pj}{\varpi _{({n_{0}})}}_{pq}{B_{{n_{0}}}}_{qj}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{1}})}}{{\partial {B_{{n_{0}}}}}} = \varpi _{({n_{0}})}^{T}{\upsilon _{({n_{0}})}}\).
Proof A.5
Let \(C = B_{{n_{0}}}^{T}\varpi _{({n_{0}})}^{T}\), \(C \in {\mathbb {R}^{{J_{{n_{0}}}} \times ({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}})}}\). \({C_{ij}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}\varpi {{{~}_{({n_{0}})}^{T}}_{pj}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {{B_{{n_{0}}}}_{pi}\varpi {{{~}_{({n_{0}})}^{T}}_{pj}}} \). \({F_{2}} = C{\upsilon _{({n_{0}})}}\), \({F_{2}} \in {\mathbb {R}^{{J_{{n_{0}}}} \times {J_{{n_{0}}}}}}\). \({F_{2}}_{ij} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{C_{iq}}{\upsilon _{({n_{0}})}}_{qj}} \). That is \({F_{2}}_{ij} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}\varpi {{{~}_{({n_{0}})}^{T}}_{pq}}{\upsilon _{({n_{0}})}}_{qj}} } \). Based on definition 3.1, \(Trace({F_{2}}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_2}_{jj}} \). \(Trace({F_2}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{I_{{n_0}}}} {{B_{{n_0}}}_{pj}\varpi {{{~}_{({n_0})}^T}_{pq}}{\upsilon _{({n_0})}}_{qj}} } } \). When p = m, j = n, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{B_{{n_0}}}_{mn}\varpi {{{~}_{({n_0})}^T}_{mq}}{\upsilon _{({n_0})}}_{qn}} )}}{{\partial {B_{{n_0}}}_{mn}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\varpi {{{~}_{({n_0})}^T}_{mq}}{\upsilon _{({n_0})}}_{qn}} \). When p≠m, j≠n, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{B_{{n_{0}}}}_{pj}\varpi {{{~}_{({n_{0}})}^{T}}_{pq}}{\upsilon _{({n_{0}})}}_{qj}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). 
Thus, \(\frac {{\partial Trace({F_{2}})}}{{\partial {B_{{n_{0}}}}}} = \varpi _{({n_{0}})}^{T}{\upsilon _{({n_{0}})}}\).
Proof A.6
Let \(C = \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}\), \(C \in {\mathbb {R}^{{I_{{n_{0}}}} \times {I_{{n_{0}}}}}}\). Then \({F_{3}} = B_{{n_{0}}}^{T}C{B_{{n_{0}}}}\). Let \(D = B_{{n_{0}}}^{T}C\), \(D \in {\mathbb {R}^{{J_{{n_{0}}}} \times {I_{{n_{0}}}}}}\). \({F_{3}} = D{B_{{n_{0}}}}\). From the rule of matrix multiplication, \({D_{ij}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}{C_{pj}}} \). \({F_{3}}_{ij} = \sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{D_{iq}}{B_{{n_{0}}}}_{qj}} \). Based on the definition of the trace norm 3.1, \(Trace({F_{3}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{3}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{D_{jq}}{B_{{n_{0}}}}_{qj}} } \). That is \(Trace({F_{3}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{jp}}{C_{pq}}{B_{{n_{0}}}}_{qj}} } } \). When q = m, j = n, \(\frac {{\partial (\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{np}}{C_{pm}}} {B_{{n_{0}}}}_{mn})}}{{\partial {B_{{n_{0}}}}_{mn}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{np}}{C_{pm}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {C_{mp}^{T}{B_{{n_{0}}}}_{pn}} \). When p = m, j = n, \(\frac {{\partial (\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{nm}}{C_{mq}}{B_{{n_{0}}}}_{qn}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = \sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{C_{mq}}{B_{{n_{0}}}}_{qn}} \). In other cases, \(\frac {{\partial (B{{{~}_{{n_{0}}}^{T}}_{jp}}{C_{pq}}{B_{{n_{0}}}}_{qj})}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = C{B_{{n_{0}}}} + {C^{T}}{B_{{n_{0}}}}\). Since \(C = \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}\), CT = C. \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = 2C{B_{{n_{0}}}}\). 
Thus, \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = \eta \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}{B_{{n_{0}}}}\), η = 2.
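The closed-form gradient of lemma 4.7 can be checked against central finite differences; the shapes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.random((7, 4))                 # plays the role of varpi_{(n0)}
C = W.T @ W                            # symmetric, as in the proof
B = rng.random((4, 3))                 # plays the role of B_{n0}

f = lambda B: np.trace(B.T @ C @ B)

# central finite differences vs the closed form 2 C B (Lemma 4.7, eta = 2)
g = np.zeros_like(B)
eps = 1e-6
for m in range(B.shape[0]):
    for n in range(B.shape[1]):
        E = np.zeros_like(B); E[m, n] = eps
        g[m, n] = (f(B + E) - f(B - E)) / (2 * eps)

assert np.allclose(g, 2 * C @ B, atol=1e-4)
```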
Proof A.7
Let \(C = \varpi _{({n_{0}})}^{T}\), \(C \in {\mathbb {R}^{{I_{{n_{0}}}} \times ({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}})}}\). From the singular value decomposition, we get that C = PSQT, where P and Q are orthogonal matrices and S is a diagonal matrix. Since S has no more rows than columns and its diagonal entries are nonzero (i.e., C has full row rank), there exists a matrix U that satisfies SU = E. Let A = QUPT; then CA = PSQTQUPT. Since P and Q are orthogonal matrices, QTQ and PPT are identity matrices. Since SU = E, CA = E, that is, \(\varpi _{({n_{0}})}^{T}A = E\). Thus, there exists a matrix A, \(A \in {\mathbb {R}^{({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}) \times {I_{{n_{0}}}}}}\), that satisfies \(\varpi _{({n_{0}})}^{T}A = E\).
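The construction in proof A.7 can be reproduced directly with NumPy's SVD; the dimensions below are illustrative, and the random wide matrix has full row rank almost surely.

```python
import numpy as np

rng = np.random.default_rng(4)
C = rng.random((3, 10))                 # wide: varpi_{(n0)}^T with I_{n0} = 3

P, s, Qt = np.linalg.svd(C, full_matrices=True)    # C = P S Q^T
U = np.zeros((10, 3))
U[:3, :3] = np.diag(1.0 / s)            # S U = E (singular values s > 0 here)
A = Qt.T @ U @ P.T                      # the matrix A = Q U P^T from the proof

assert np.allclose(C @ A, np.eye(3))    # varpi_{(n0)}^T A = E
```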
Proof A.8
According to lemma 4.8, there exists a matrix A, \(A \in {\mathbb {R}^{({J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}) \times {I_{{n_0}}}}}\), that satisfies \(\varpi _{({n_0})}^TA = E\), where E is the identity matrix, \(E \in {\mathbb {R}^{{I_{{n_0}}} \times {I_{{n_0}}}}}\). From proof A.7, we get that A = QUPT, where PSQT is the singular value decomposition of \(\varpi _{({n_0})}^T\) and U can be calculated from SU = E. Since \({\varpi _{({n_0})}}{B_{{n_0}}} = {\upsilon _{({n_0})}}\), \({A^T}{\varpi _{({n_0})}}{B_{{n_0}}} = {A^T}{\upsilon _{({n_0})}}\). Since \(\varpi _{({n_0})}^TA = {A^T}{\varpi _{({n_0})}} = E\), \({B_{{n_0}}} = {A^T}{\upsilon _{({n_0})}}\). That is, \({B_{{n_0}}} = P{U^T}{Q^T}{\upsilon _{({n_0})}}\).
A.2 Supplementary material 2
Proofs A.9–A.11 prove Lemmas 4.11–4.13, respectively.
Proof A.9
According to the rule of matrix multiplication, \({F_4}_{ij} = \sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {P{{{~}_{({n_{0}})}^{T}}_{ip}}} {\widehat \chi _{({n_{0}})pj}}\). From the definition of the trace norm 3.1, \(Trace({F_{4}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{4}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{P_{({n_{0}})}}_{pj}{{\widehat \chi }_{({n_{0}})pj}}} } \). When p = m, j = n, \(\frac {{\partial ({P_{({n_{0}})}}_{mn}{{\widehat \chi }_{({n_{0}})mn}})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = {P_{({n_{0}})}}_{mn}\). When p≠m, j≠n, \(\frac {{\partial ({P_{({n_{0}})}}_{pj}{{\widehat \chi }_{({n_{0}})pj}})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{4}})}}{{\partial {{\widehat \chi }_{({n_{0}})}}}} = {P_{({n_{0}})}}\).
Proof A.10
According to the rule of matrix multiplication, \({F_{5}}_{ij} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\widehat \chi _{({n_0})ip}^T{P_{({n_0})}}_{pj}} \). From the definition of the trace norm 3.1, \(Trace({F_5}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_5}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{{\widehat \chi }_{({n_0})pj}}{P_{({n_0})}}_{pj}} } \). When p = m, j = n, \(\frac {{\partial ({{\widehat \chi }_{({n_0})mn}}{P_{({n_0})}}_{mn})}}{{\partial {{\widehat \chi }_{({n_0})mn}}}} = {P_{({n_0})}}_{mn}\). When p≠m, j≠n, \(\frac {{\partial ({{\widehat \chi }_{({n_0})pj}}{P_{({n_0})}}_{pj})}}{{\partial {{\widehat \chi }_{({n_0})mn}}}} = 0\). Thus, \(\frac {{\partial Trace({F_5})}}{{\partial {{\widehat \chi }_{({n_0})}}}} = {P_{({n_0})}}\).
Proof A.11
According to the rule of matrix multiplication, \({F_6}_{ij} = \sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\widehat \chi _{({n_{0}})ip}^{T}{{\widehat \chi }_{({n_{0}})pj}}} \). From the definition of the trace norm 3.1, \(Trace({F_{6}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{6}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\widehat \chi _{({n_{0}})pj}^{2}} } \). When p = m, j = n, \(\frac {{\partial (\widehat \chi _{({n_{0}})mn}^{2})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 2{\widehat \chi _{({n_{0}})mn}}\). When p≠m, j≠n, \(\frac {{\partial (\widehat \chi _{({n_{0}})pj}^{2})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{6}})}}{{\partial {{\widehat \chi }_{({n_{0}})}}}} = \eta {\widehat \chi _{({n_{0}})}}\), η = 2.
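The three trace derivatives of lemmas 4.11–4.13 can be checked in one pass against central finite differences; shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
P = rng.random((6, 4))                  # plays the role of P_{(n0)}
X = rng.random((6, 4))                  # plays the role of hat-chi_{(n0)}
eps = 1e-6

def num_grad(f, X):
    """Central finite-difference gradient of scalar f at X."""
    g = np.zeros_like(X)
    for m in range(X.shape[0]):
        for n in range(X.shape[1]):
            E = np.zeros_like(X); E[m, n] = eps
            g[m, n] = (f(X + E) - f(X - E)) / (2 * eps)
    return g

assert np.allclose(num_grad(lambda X: np.trace(P.T @ X), X), P, atol=1e-5)      # Lemma 4.11
assert np.allclose(num_grad(lambda X: np.trace(X.T @ P), X), P, atol=1e-5)      # Lemma 4.12
assert np.allclose(num_grad(lambda X: np.trace(X.T @ X), X), 2 * X, atol=1e-5)  # Lemma 4.13
```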
A.3 Supplementary material 3
Proofs A.12–A.15 prove Lemmas 4.15–4.18, respectively.
Proof A.12
Let D = AB, \(D \in {\mathbb {R}^{m \times p}}\). According to the rule of matrix multiplication, \({D_{ij}} = \sum \limits _{a = 1}^{n} {{A_{ia}}{B_{aj}}} \). Let G = DC, \(G \in {\mathbb {R}^{m \times m}}\). Then \({G_{ij}} = \sum \limits _{b = 1}^{p} {{D_{ib}}{C_{bj}}} = \sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {{A_{ia}}{B_{ab}}{C_{bj}}} } \). From the definition of the trace norm 3.1, \(Trace(G) = \sum \limits _{h = 1}^{m} {{G_{hh}}} = \sum \limits _{h = 1}^{m} {\sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {{A_{ha}}{B_{ab}}{C_{bh}}} } } \).
Let D1 = CA, \({D_{1}} \in {\mathbb {R}^{p \times n}}\). According to the rule of matrix multiplication, \({D_{1ij}} = \sum \limits _{h = 1}^{m} {{C_{ih}}{A_{hj}}} \). Let G1 = D1B, \({G_{1}} \in {\mathbb {R}^{p \times p}}\). \({G_{1ij}} = \sum \limits _{a = 1}^{n} {{D_{1ia}}{B_{aj}}} = \sum \limits _{a = 1}^{n} {\sum \limits _{h = 1}^{m} {{C_{ih}}{A_{ha}}{B_{aj}}} } \). From the definition of the trace norm 3.1, \(Trace({G_{1}}) = \sum \limits _{b = 1}^{p} {{G_{1bb}}} = \sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {\sum \limits _{h = 1}^{m} {{C_{bh}}{A_{ha}}{B_{ab}}} } } \). Thus, Trace(ABC) = Trace(CAB).
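The cyclic-trace identity of lemma 4.15 is easy to confirm numerically for conformable rectangular matrices:

```python
import numpy as np

rng = np.random.default_rng(6)
A, B, C = rng.random((3, 4)), rng.random((4, 5)), rng.random((5, 3))

# Lemma 4.15: the trace is invariant under cyclic permutation of the factors
assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))
```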
Proof A.13
According to lemma 4.15, we can obtain that \(Trace(\varphi {({\widehat \chi ^{(m)}})^{T}}\widehat \chi _{vec}^{(m)T}\mu ) = Trace(\widehat \chi _{vec}^{(m)T}\mu \varphi {({\widehat \chi ^{(m)}})^T})\). Let \(A = \mu \varphi {({\widehat \chi ^{(m)}})^T}\), \(A \in {\mathbb {R}^{{J_1}{J_2} {\cdots } {J_N}}}\). According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^1 {{\mu _{ip}}\varphi ({{\widehat \chi }^{(m)}})_{pj}^T} \). Let \(D = \widehat \chi _{vec}^{(m)T}A\), \({D_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{veciq}^{(m)T}{A_{qj}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^1 {\widehat \chi _{veciq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pj}^T} } \). From the definition of the trace norm 3.1, \(Trace(D) = \sum \limits _{k = 1}^1 {{D_{kk}}} = \sum \limits _{k = 1}^1 {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^1 {\widehat \chi _{veckq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pk}^T} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^1 {\widehat \chi _{vecst}^{(m)}{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^T} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^{1} {{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^{T}} \). When q≠s, k≠t, \(\frac {{\partial (\sum \limits _{p = 1}^{1} {\widehat \chi _{veckq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pk}^{T}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Therefore, \(\frac {{\partial {f_{1}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \mu \varphi {({\widehat \chi ^{(m)}})^{T}}\).
Proof A.14
According to lemma 4.15, we can obtain that \(Trace({\mu ^{T}}\widehat \chi _{vec}^{(m)}\varphi ({\widehat \chi ^{(m)}})) = Trace(\varphi ({\widehat \chi ^{(m)}}){\mu ^T}\widehat \chi _{vec}^{(m)})\). Let \(A = \varphi ({\widehat \chi ^{(m)}}){\mu ^T}\), According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^1 {\varphi {{({{\widehat \chi }^{(m)}})}_{ip}}\mu _{pj}^T} \). Let \(D = A\widehat \chi _{vec}^{(m)}\), \({D_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{A_{iq}}\widehat \chi _{vecqj}^{(m)}} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{ip}}\mu _{pq}^{T}\widehat \chi _{vecqj}^{(m)}} } \). From the definition of the trace norm 3.1, \(Trace(D) = \sum \limits _{k = 1}^{1} {{D_{kk}}} = \sum \limits _{k = 1}^{1} {\sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{kp}}\mu _{pq}^{T}\widehat \chi _{vecqk}^{(m)}} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{tp}}\mu _{ps}^{T}\widehat \chi _{vecst}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^1 {{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^T} \). When q≠s, k≠t, \(\frac {{\partial (\sum \limits _{p = 1}^1 {\varphi {{({{\widehat \chi }^{(m)}})}_{kp}}\mu _{pq}^T\widehat \chi _{vecqk}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Thus, \(\frac {{\partial {f_2}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \mu \varphi {({\widehat \chi ^{(m)}})^T}\).
Proof A.15
Based on lemma 4.15, we get that \(Trace({\mu ^T}\widehat \chi _{vec}^{(m)}\widehat \chi _{vec}^{(m)T}\mu ) = Trace(\widehat \chi _{vec}^{(m)T}\mu {\mu ^T}\widehat \chi _{vec}^{(m)})\). Let D = μμT, \(A = \widehat \chi _{vec}^{(m)T}D\). According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecip}^{(m)T}{D_{pj}}} \). Let \(H = A\widehat \chi _{vec}^{(m)}\), \({H_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{A_{iq}}\widehat \chi _{vecqj}^{(m)}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecip}^{(m)T}{D_{pq}}\widehat \chi _{vecqj}^{(m)}} } \). From the definition of the trace norm 3.1, \(Trace(H) = \sum \limits _{k = 1}^1 {{H_{kk}}} = \sum \limits _{k = 1}^1 {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{veckp}^{(m)T}{D_{pq}}\widehat \chi _{vecqk}^{(m)}} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vectp}^{(m)T}{D_{ps}}\widehat \chi _{vecst}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {D_{sp}^T\widehat \chi _{vecpt}^{(m)}} \). When p = s, k = t, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecst}^{(m)}{D_{sq}}\widehat \chi _{vecqt}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{D_{sq}}\widehat \chi _{vecqt}^{(m)}} \). In other cases, \(\frac {{\partial (\widehat \chi _{veckp}^{(m)T}{D_{pq}}\widehat \chi _{vecqk}^{(m)})}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Thus, \(\frac {{\partial {f_{3}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = D\widehat \chi _{vec}^{(m)} + {D^{T}}\widehat \chi _{vec}^{(m)}\). Since D = DT, \(\frac {{\partial {f_{3}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \eta \mu {\mu ^{T}}\widehat \chi _{vec}^{(m)}\), where η is a constant, η = 2.
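The gradient derived in proof A.15 can also be checked against central finite differences, treating \(\widehat \chi _{vec}^{(m)}\) as a column vector (the dimension below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
d = 8
mu = rng.random((d, 1))                 # plays the role of mu
x = rng.random((d, 1))                  # plays the role of hat-chi_vec^{(m)}

f = lambda x: np.trace(mu.T @ x @ x.T @ mu)

# central finite differences vs the closed form 2 mu mu^T x (eta = 2)
eps = 1e-6
g = np.zeros_like(x)
for m in range(d):
    E = np.zeros_like(x); E[m, 0] = eps
    g[m, 0] = (f(x + E) - f(x - E)) / (2 * eps)

assert np.allclose(g, 2 * mu @ mu.T @ x, atol=1e-4)
```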
About this article
Cite this article
Guo, X., Zhang, H., Ye, L. et al. TenLa: an approach based on controllable tensor decomposition and optimized lasso regression for judgement prediction of legal cases. Appl Intell 51, 2233–2252 (2021). https://doi.org/10.1007/s10489-020-01912-z