TenLa: an approach based on controllable tensor decomposition and optimized lasso regression for judgement prediction of legal cases

Abstract

With the development of big data and artificial intelligence technology, computer-assisted judgment of legal cases has become an inevitable trend at the intersection of computer science and law. Judgment prediction for legal cases mainly consists of two parts: (1) modeling of legal cases and (2) construction of judgment prediction algorithms. Previous methods are mainly based on feature models and classification algorithms. Traditional feature models require extensive expert knowledge and manual annotation, and they depend heavily on the vocabulary and grammatical information in a particular database, which limits the accuracy and generality of subsequent prediction algorithms. In addition, the predictions produced by classification algorithms are coarse-grained and of low accuracy. In general, similar legal cases receive similar judgments. This article proposes a new method for the judgment prediction of legal cases, namely TenLa, which is based on a controllable tensor decomposition algorithm and an optimized Lasso regression model. TenLa takes the similarity between legal cases as an important indicator for judgment prediction and consists of three parts: (1) ModTen, a modeling method that represents legal cases as three-dimensional tensors; (2) ConTen, a new tensor decomposition algorithm that decomposes the tensors obtained by ModTen into core tensors through an intermediary tensor, greatly reducing the dimensionality of the original tensors; and (3) OLass, an optimized Lasso regression algorithm trained on the core tensors obtained by ConTen. Specifically, we optimize OLass with respect to the intermediary tensor in ConTen, so that the core tensors carry the tensor elements and structural information most conducive to improving the accuracy of OLass. Experiments show that TenLa achieves higher accuracy than traditional judgment prediction algorithms.
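
To make the three-stage pipeline concrete, the following is a minimal, illustrative sketch only. The tensor shapes, the HOSVD-style mode projection used here as a stand-in for ConTen, and scikit-learn's Lasso used as a stand-in for OLass are assumptions for illustration; they are not the paper's own algorithms.

```python
# Minimal sketch of a ModTen -> ConTen -> OLass style pipeline (illustrative only).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# ModTen (assumed shapes): each legal case as a 3-D tensor,
# e.g. (sections x sentences x embedding dimension).
n_cases, I1, I2, I3 = 200, 8, 10, 16
cases = rng.normal(size=(n_cases, I1, I2, I3))
judgments = rng.normal(size=n_cases)          # e.g. a sentencing value

def unfold(x, mode):
    """Mode-n unfolding: rows index the remaining modes, columns index mode n."""
    return np.moveaxis(x, mode, -1).reshape(-1, x.shape[mode])

def hosvd_core(x, ranks):
    """Stand-in for ConTen: project each mode onto its leading singular vectors."""
    core = x
    for mode, r in enumerate(ranks):
        # Leading r right singular vectors of the mode-n unfolding.
        _, _, vt = np.linalg.svd(unfold(core, mode), full_matrices=False)
        proj = vt[:r].T                                    # (I_n, r)
        core = np.moveaxis(np.tensordot(core, proj, axes=([mode], [0])), -1, mode)
    return core

ranks = (4, 5, 6)                              # core-tensor sizes (assumed)
features = np.stack([hosvd_core(c, ranks).ravel() for c in cases])

# OLass stand-in: ordinary Lasso regression on the vectorised core tensors.
model = Lasso(alpha=0.1).fit(features[:150], judgments[:150])
print("held-out predictions:", model.predict(features[150:155]))
```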

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2018YFC0830900 and 2016QY03D0501).

Funding

This study was funded by the National Key Research and Development Program of China (2018YFC0830900 and 2016QY03D0501).

Author information

Corresponding author

Correspondence to Xiaoding Guo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Ethical statement

I confirm that the manuscript has not been submitted to more than one journal for simultaneous consideration. The manuscript has not been published previously (partly or in full) unless the new work concerns an expansion of previous work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Appendix

A.1 Supplementary material 1

Proofs A.1–A.8 prove lemmas 4.2–4.9, respectively.

Proof A.1

From definition 3.4, we can see that the values of corresponding elements in χ and \({\chi _{({n_0})}}\) are equal. According to the definition of the Frobenius norm 3.3, the sums of squares of the elements in χ and \({\chi _{({n_0})}}\) are therefore equal. That is, \(\left \| \chi \right \|_F^2 = \left \| {{\chi _{({n_0})}}} \right \|_F^2\).

Proof A.2

Let \(C = A^TA\), \(C \in {\mathbb {R}^{J \times J}}\). According to the rule of matrix multiplication, we can obtain that \({C_{ij}} = \sum \limits _{p = 1}^I {A_{ip}^T{A_{pj}}} = \sum \limits _{p = 1}^I {{A_{pi}}{A_{pj}}} \). Based on the definition of the trace norm 3.1, we get that \(Trace(C) = \sum \limits _{j = 1}^J {{C_{jj}}} \). That is \(Trace(C) = \sum \limits _{j = 1}^J {\sum \limits _{p = 1}^I {A_{pj}^2} } \). From the definition of Frobenius norm 3.3, we obtain that \(\left \| A \right \|_F^2 = \sum \limits _{p = 1}^I {\sum \limits _{j = 1}^J {A_{pj}^2} } \). Thus, \(\left \| A \right \|_F^2 = Trace({A^T}A)\).
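
A quick numerical check of the two identities above (Proofs A.1 and A.2), with arbitrary shapes and the unfolding convention of definition 3.4 (rows index the remaining modes, columns index mode \(n_0\)):

```python
# Numerical check of Proofs A.1 and A.2 with arbitrary shapes (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
chi = rng.normal(size=(3, 4, 5))

def unfold(x, mode):
    # Mode-n unfolding as in definition 3.4: columns index mode n.
    return np.moveaxis(x, mode, -1).reshape(-1, x.shape[mode])

# Proof A.1: unfolding only rearranges elements, so the Frobenius norm is unchanged.
assert np.isclose(np.sum(chi**2), np.sum(unfold(chi, 1)**2))

# Proof A.2: ||A||_F^2 = Trace(A^T A).
A = rng.normal(size=(6, 4))
assert np.isclose(np.sum(A**2), np.trace(A.T @ A))
print("Proofs A.1 and A.2 verified numerically")
```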

Proof A.3

Let \(\lambda = \chi { \times _{{n_0}}}P\), \(\lambda \in {\mathbb {R}^{{I_1} \times {I_2} \times {\cdots } \times {I_{{n_0} - 1}} \times {J_{{n_0}}} \times {I_{{n_0} + 1}} \times {\cdots } \times {I_N}}}\). From definition 3.6, we get that \({\lambda _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{j_{{n_0}}}{i_{{n_0} + 1}} {\cdots } {i_N}}} = \sum \limits _{i = 1}^{{I_{{n_0}}}} {{\chi _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}i{i_{{n_0} + 1}} {\cdots } {i_N}}}{P_{i{j_{{n_0}}}}}} \). According to definition 3.4, we obtain that \({\lambda _{({n_0})}} \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {J_{{n_0}}}}}\). \({\lambda _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}} = {\lambda _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{j_{{n_0}}}{i_{{n_0} + 1}} {\cdots } {i_N}}}\). Since \({\chi _{({n_0})}} \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {I_{{n_0}}}}}\), let \(A = {\chi _{({n_0})}} \times P\), \(A \in {\mathbb {R}^{({I_1}{I_2} {\cdots } {I_{{n_0} - 1}}{I_{{n_0} + 1}} {\cdots } {I_N}) \times {J_{{n_0}}}}}\). From the rule of matrix multiplication, \({A_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}}} = \sum \limits _{i = 1}^{{I_{{n_{0}}}}} {{\chi _{({n_{0}})}}_{({i_{1}}{i_{2}} {\cdots } {i_{{n_{0}} - 1}}{i_{{n_{0}} + 1}} {\cdots } {i_{N}})i}{P_{i{j_{{n_{0}}}}}}} \). Since \({\chi _{({n_{0}})}}_{({i_{1}}{i_{2}} {\cdots } {i_{{n_{0}} - 1}}{i_{{n_{0}} + 1}} {\cdots } {i_{N}})i} = {\chi _{{i_1}{i_2} {\cdots } {i_{{n_0} - 1}}i{i_{{n_0} + 1}} {\cdots } {i_N}}}\), \({\lambda _{({n_0})}}_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}} = {A_{({i_1}{i_2} {\cdots } {i_{{n_0} - 1}}{i_{{n_0} + 1}} {\cdots } {i_N}){j_{{n_0}}}}}\). \({\lambda _{({n_0})}} = A\). That is \({(\chi { \times _{{n_0}}}P)_{({n_0})}} = {\chi _{({n_0})}} \times P\).
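
The same convention gives a direct numerical check of Proof A.3; the shapes and the choice \(n_0 = 2\) below are illustrative:

```python
# Numerical check of Proof A.3: unfolding a mode-n product equals multiplying
# the unfolding by P (shapes and the choice n0 = 2 are illustrative).
import numpy as np

rng = np.random.default_rng(1)
I1, I2, I3, J2 = 3, 4, 5, 2
chi = rng.normal(size=(I1, I2, I3))
P = rng.normal(size=(I2, J2))

def unfold(x, mode):
    return np.moveaxis(x, mode, -1).reshape(-1, x.shape[mode])

# Mode-2 product (definition 3.6): contract mode 2 of chi with the rows of P.
lam = np.moveaxis(np.tensordot(chi, P, axes=([1], [0])), -1, 1)   # (I1, J2, I3)

assert np.allclose(unfold(lam, 1), unfold(chi, 1) @ P)
print("(chi x_{n0} P)_{(n0)} == chi_{(n0)} P verified")
```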

Proof A.4

Let \(C = \upsilon _{({n_0})}^T{\varpi _{({n_0})}}\), \(C \in {\mathbb {R}^{{J_{{n_0}}} \times {I_{{n_0}}}}}\). From the rule of matrix multiplication, \({C_{ij}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\upsilon {{{~}_{({n_0})}^T}_{ip}}{\varpi _{({n_0})}}_{pj}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pi}{\varpi _{({n_0})}}_{pj}} \). Then \({F_1} = C{B_{{n_0}}}\), \({F_1} \in {\mathbb {R}^{{J_{{n_0}}} \times {J_{{n_0}}}}}\). \({F_1}_{ij} = \sum \limits _{q = 1}^{{I_{{n_0}}}} {{C_{iq}}{B_{{n_0}}}_{qj}} = \sum \limits _{q = 1}^{{I_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pi}{\varpi _{({n_0})}}_{pq}{B_{{n_0}}}_{qj}} } \). Based on the definition of the trace norm 3.1, we get that \(Trace({F_1}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_1}_{jj}} \). That is \(Trace({F_1}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{q = 1}^{{I_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pj}{\varpi _{({n_0})}}_{pq}{B_{{n_0}}}_{qj}} } } \). When q = m, j = n, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pn}{\varpi _{({n_0})}}_{pm}{B_{{n_0}}}_{mn}} )}}{{\partial {B_{{n_0}}}_{mn}}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{\upsilon _{({n_0})}}_{pn}{\varpi _{({n_0})}}_{pm}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\varpi {{{~}_{({n_0})}^T}_{mp}}} {\upsilon _{({n_0})}}_{pn}\). When q ≠ m, j ≠ n, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{\upsilon _{({n_{0}})}}_{pj}{\varpi _{({n_{0}})}}_{pq}{B_{{n_{0}}}}_{qj}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{1}})}}{{\partial {B_{{n_{0}}}}}} = \varpi _{({n_{0}})}^{T}{\upsilon _{({n_{0}})}}\).

Proof A.5

Let \(C = B_{{n_{0}}}^{T}\varpi _{({n_{0}})}^{T}\), \(C \in {\mathbb {R}^{{J_{{n_{0}}}} \times ({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}})}}\). \({C_{ij}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}\varpi {{{~}_{({n_{0}})}^{T}}_{pj}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {{B_{{n_{0}}}}_{pi}\varpi {{{~}_{({n_{0}})}^{T}}_{pj}}} \). \({F_{2}} = C{\upsilon _{({n_{0}})}}\), \({F_{2}} \in {\mathbb {R}^{{J_{{n_{0}}}} \times {J_{{n_{0}}}}}}\). \({F_{2}}_{ij} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{C_{iq}}{\upsilon _{({n_{0}})}}_{qj}} \). That is \({F_{2}}_{ij} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}\varpi {{{~}_{({n_{0}})}^{T}}_{pq}}{\upsilon _{({n_{0}})}}_{qj}} } \). Based on definition 3.1, \(Trace({F_{2}}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_2}_{jj}} \). \(Trace({F_2}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{I_{{n_0}}}} {{B_{{n_0}}}_{pj}\varpi {{{~}_{({n_0})}^T}_{pq}}{\upsilon _{({n_0})}}_{qj}} } } \). When p = m, j = n, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{B_{{n_0}}}_{mn}\varpi {{{~}_{({n_0})}^T}_{mq}}{\upsilon _{({n_0})}}_{qn}} )}}{{\partial {B_{{n_0}}}_{mn}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\varpi {{{~}_{({n_0})}^T}_{mq}}{\upsilon _{({n_0})}}_{qn}} \). When p ≠ m, j ≠ n, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{B_{{n_{0}}}}_{pj}\varpi {{{~}_{({n_{0}})}^{T}}_{pq}}{\upsilon _{({n_{0}})}}_{qj}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). Thus, \(\frac {{\partial Trace({F_{2}})}}{{\partial {B_{{n_{0}}}}}} = \varpi _{({n_{0}})}^{T}{\upsilon _{({n_{0}})}}\).

Proof A.6

Let \(C = \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}\), \(C \in {\mathbb {R}^{{I_{{n_{0}}}} \times {I_{{n_{0}}}}}}\). Then \({F_{3}} = B_{{n_{0}}}^{T}C{B_{{n_{0}}}}\). Let \(D = B_{{n_{0}}}^{T}C\), \(D \in {\mathbb {R}^{{J_{{n_{0}}}} \times {I_{{n_{0}}}}}}\). \({F_{3}} = D{B_{{n_{0}}}}\). From the rule of matrix multiplication, \({D_{ij}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{ip}}{C_{pj}}} \). \({F_{3}}_{ij} = \sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{D_{iq}}{B_{{n_{0}}}}_{qj}} \). Based on the definition of the trace norm 3.1, \(Trace({F_{3}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{3}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{D_{jq}}{B_{{n_{0}}}}_{qj}} } \). That is \(Trace({F_{3}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{jp}}{C_{pq}}{B_{{n_{0}}}}_{qj}} } } \). When q = m, j = n, \(\frac {{\partial (\sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{np}}{C_{pm}}} {B_{{n_{0}}}}_{mn})}}{{\partial {B_{{n_{0}}}}_{mn}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{np}}{C_{pm}}} = \sum \limits _{p = 1}^{{I_{{n_{0}}}}} {C_{mp}^{T}{B_{{n_{0}}}}_{pn}} \). When p = m, j = n, \(\frac {{\partial (\sum \limits _{q = 1}^{{I_{{n_{0}}}}} {B{{{~}_{{n_{0}}}^{T}}_{nm}}{C_{mq}}{B_{{n_{0}}}}_{qn}} )}}{{\partial {B_{{n_{0}}}}_{mn}}} = \sum \limits _{q = 1}^{{I_{{n_{0}}}}} {{C_{mq}}{B_{{n_{0}}}}_{qn}} \). In other cases, \(\frac {{\partial (B{{{~}_{{n_{0}}}^{T}}_{jp}}{C_{pq}}{B_{{n_{0}}}}_{qj})}}{{\partial {B_{{n_{0}}}}_{mn}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = C{B_{{n_{0}}}} + {C^{T}}{B_{{n_{0}}}}\). Since \(C = \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}\), \({C^T} = C\). \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = 2C{B_{{n_{0}}}}\). Thus, \(\frac {{\partial Trace({F_{3}})}}{{\partial {B_{{n_{0}}}}}} = \eta \varpi _{({n_{0}})}^{T}{\varpi _{({n_{0}})}}{B_{{n_{0}}}}\), η = 2.
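
The three trace gradients derived in Proofs A.4, A.5 and A.6 can be confirmed by a finite-difference check; the matrix sizes below are illustrative, and `ups` and `var` stand for \({\upsilon _{({n_0})}}\) and \({\varpi _{({n_0})}}\):

```python
# Finite-difference check of the trace gradients in Proofs A.4-A.6 (illustrative sizes).
import numpy as np

rng = np.random.default_rng(2)
rows, I, J = 12, 5, 3                      # rows = J_1 J_2 ... (excluding n0)
ups = rng.normal(size=(rows, J))           # upsilon_(n0)
var = rng.normal(size=(rows, I))           # varpi_(n0)
B = rng.normal(size=(I, J))                # B_(n0)

def num_grad(f, B, eps=1e-6):
    g = np.zeros_like(B)
    for m in range(B.shape[0]):
        for n in range(B.shape[1]):
            Bp, Bm = B.copy(), B.copy()
            Bp[m, n] += eps
            Bm[m, n] -= eps
            g[m, n] = (f(Bp) - f(Bm)) / (2 * eps)
    return g

f1 = lambda B: np.trace(ups.T @ var @ B)          # Proof A.4
f2 = lambda B: np.trace(B.T @ var.T @ ups)        # Proof A.5
f3 = lambda B: np.trace(B.T @ var.T @ var @ B)    # Proof A.6

assert np.allclose(num_grad(f1, B), var.T @ ups, atol=1e-5)
assert np.allclose(num_grad(f2, B), var.T @ ups, atol=1e-5)
assert np.allclose(num_grad(f3, B), 2 * var.T @ var @ B, atol=1e-5)
print("gradients of Proofs A.4-A.6 verified")
```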

Proof A.7

Let \(C = \varpi _{({n_{0}})}^{T}\), \(C \in {\mathbb {R}^{{I_{{n_{0}}}} \times ({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}})}}\). From the singular value decomposition, we get that \(C = PSQ^T\), where P and Q are orthogonal matrices and S is a diagonal matrix. Provided that C has full row rank, i.e., all \({I_{{n_0}}}\) singular values on the diagonal of S are nonzero, there exists U that satisfies SU = E. Let \(A = QUP^T\); then \(CA = PSQ^TQUP^T\). Since P and Q are orthogonal matrices, \(Q^TQ\) and \(PP^T\) are identity matrices. Since SU = E, CA = E. That is \(\varpi _{({n_{0}})}^{T}A = E\). Thus, there exists a matrix A, \(A \in {\mathbb {R}^{({J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}) \times {I_{{n_{0}}}}}}\), that satisfies \(\varpi _{({n_{0}})}^{T}A = E\).

Proof A.8

According to lemma 4.8, there exists a matrix A, \(A \in {\mathbb {R}^{({J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}) \times {I_{{n_0}}}}}\), that satisfies \(\varpi _{({n_0})}^TA = E\), where E is the identity matrix, \(E \in {\mathbb {R}^{{I_{{n_0}}} \times {I_{{n_0}}}}}\). From proof A.7, we get that \(A = QUP^T\), where \(PSQ^T\) is the singular value decomposition of \(\varpi _{({n_0})}^T\) and U can be calculated from SU = E. Since \({\varpi _{({n_0})}}{B_{{n_0}}} = {\upsilon _{({n_0})}}\), \({A^T}{\varpi _{({n_0})}}{B_{{n_0}}} = {A^T}{\upsilon _{({n_0})}}\). Since \(\varpi _{({n_0})}^TA = {A^T}{\varpi _{({n_0})}} = E\), \({B_{{n_0}}} = {A^T}{\upsilon _{({n_0})}}\). That is \({B_{{n_0}}} = P{U^T}{Q^T}{\upsilon _{({n_0})}}\).
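
Proofs A.7 and A.8 can be illustrated numerically: the right inverse A is built from the SVD of \(\varpi _{({n_0})}^T\) and then used to recover \({B_{{n_0}}}\). The sizes below are illustrative, and \({\varpi _{({n_0})}}\) is assumed to have full column rank, as the argument requires:

```python
# Numerical illustration of Proofs A.7 and A.8: build a right inverse A of
# varpi_(n0)^T from its SVD and recover B_(n0) from varpi_(n0) B_(n0) = upsilon_(n0).
import numpy as np

rng = np.random.default_rng(3)
rows, I, J = 20, 4, 3                      # rows = J_1 J_2 ... (excluding n0)
var = rng.normal(size=(rows, I))           # varpi_(n0), full column rank w.p. 1
B_true = rng.normal(size=(I, J))
ups = var @ B_true                         # upsilon_(n0)

C = var.T                                  # C = varpi_(n0)^T, shape (I, rows)
P, s, Qt = np.linalg.svd(C, full_matrices=False)   # C = P diag(s) Qt
U = np.diag(1.0 / s)                       # solves diag(s) U = E (nonzero singular values)
A = Qt.T @ U @ P.T                         # A = Q U P^T, a right inverse of C

assert np.allclose(C @ A, np.eye(I))       # Proof A.7: varpi^T A = E
B_rec = A.T @ ups                          # Proof A.8: B = A^T upsilon = P U^T Q^T upsilon
assert np.allclose(B_rec, B_true)
print("B_(n0) recovered from the SVD-based right inverse")
```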

A.2 Supplementary material 2

Proofs A.9–A.11 prove lemmas 4.11–4.13, respectively.

Proof A.9

According to the rule of matrix multiplication, \({F_4}_{ij} = \sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {P{{{~}_{({n_{0}})}^{T}}_{ip}}} {\widehat \chi _{({n_{0}})pj}}\). From the definition of the trace norm 3.1, \(Trace({F_{4}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{4}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {{P_{({n_{0}})}}_{pj}{{\widehat \chi }_{({n_{0}})pj}}} } \). When p = m, j = n, \(\frac {{\partial ({P_{({n_{0}})}}_{mn}{{\widehat \chi }_{({n_{0}})mn}})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = {P_{({n_{0}})}}_{mn}\). When p ≠ m, j ≠ n, \(\frac {{\partial ({P_{({n_{0}})}}_{pj}{{\widehat \chi }_{({n_{0}})pj}})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{4}})}}{{\partial {{\widehat \chi }_{({n_{0}})}}}} = {P_{({n_{0}})}}\).

Proof A.10

According to the rule of matrix multiplication, \({F_{5}}_{ij} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {\widehat \chi _{({n_0})ip}^T{P_{({n_0})}}_{pj}} \). From the definition of the trace norm 3.1, \(Trace({F_5}) = \sum \limits _{j = 1}^{{J_{{n_0}}}} {{F_5}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_0}}}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_{{n_0} - 1}}{J_{{n_0} + 1}} {\cdots } {J_N}} {{{\widehat \chi }_{({n_0})pj}}{P_{({n_0})}}_{pj}} } \). When p = m, j = n, \(\frac {{\partial ({{\widehat \chi }_{({n_0})mn}}{P_{({n_0})}}_{mn})}}{{\partial {{\widehat \chi }_{({n_0})mn}}}} = {P_{({n_0})}}_{mn}\). When p ≠ m, j ≠ n, \(\frac {{\partial ({{\widehat \chi }_{({n_0})pj}}{P_{({n_0})}}_{pj})}}{{\partial {{\widehat \chi }_{({n_0})mn}}}} = 0\). Thus, \(\frac {{\partial Trace({F_5})}}{{\partial {{\widehat \chi }_{({n_0})}}}} = {P_{({n_0})}}\).

Proof A.11

According to the rule of matrix multiplication, \({F_6}_{ij} = \sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\widehat \chi _{({n_{0}})ip}^{T}{{\widehat \chi }_{({n_{0}})pj}}} \). From the definition of the trace norm 3.1, \(Trace({F_{6}}) = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {{F_{6}}_{jj}} = \sum \limits _{j = 1}^{{J_{{n_{0}}}}} {\sum \limits _{p = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{{n_{0}} - 1}}{J_{{n_{0}} + 1}} {\cdots } {J_{N}}} {\widehat \chi _{({n_{0}})pj}^{2}} } \). When p = m, j = n, \(\frac {{\partial (\widehat \chi _{({n_{0}})mn}^{2})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 2{\widehat \chi _{({n_{0}})mn}}\). When p ≠ m, j ≠ n, \(\frac {{\partial (\widehat \chi _{({n_{0}})pj}^{2})}}{{\partial {{\widehat \chi }_{({n_{0}})mn}}}} = 0\). Therefore, \(\frac {{\partial Trace({F_{6}})}}{{\partial {{\widehat \chi }_{({n_{0}})}}}} = \eta {\widehat \chi _{({n_{0}})}}\), η = 2.
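
The gradients in Proofs A.9, A.10 and A.11 follow from the fact that each trace reduces to an element-wise sum over the unfolded tensor; a brief numerical illustration with arbitrary shapes:

```python
# Element-wise view of Proofs A.9-A.11: each trace is a simple sum over the
# entries of the unfolded tensor, which makes the stated gradients immediate.
# (P_mat and X stand for P_(n0) and chi-hat_(n0); sizes are illustrative.)
import numpy as np

rng = np.random.default_rng(4)
P_mat = rng.normal(size=(10, 4))
X = rng.normal(size=(10, 4))

# Trace(P^T X) = Trace(X^T P) = sum_{p,j} P_{pj} X_{pj}, so the gradient in X is P.
assert np.isclose(np.trace(P_mat.T @ X), np.sum(P_mat * X))
assert np.isclose(np.trace(X.T @ P_mat), np.sum(P_mat * X))
# Trace(X^T X) = sum_{p,j} X_{pj}^2, so the gradient in X is 2X.
assert np.isclose(np.trace(X.T @ X), np.sum(X**2))
print("trace identities behind Proofs A.9-A.11 verified")
```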

A.3 Supplementary material 3

Proofs A.12–A.15 prove lemmas 4.15–4.18, respectively.

Proof A.12

Let D = AB, \(D \in {\mathbb {R}^{m \times p}}\). According to the rule of matrix multiplication, \({D_{ij}} = \sum \limits _{a = 1}^{n} {{A_{ia}}{B_{aj}}} \). Let G = DC, \(G \in {\mathbb {R}^{m \times m}}\). Then \({G_{ij}} = \sum \limits _{b = 1}^{p} {{D_{ib}}{C_{bj}}} = \sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {{A_{ia}}{B_{ab}}{C_{bj}}} } \). From the definition of the trace norm 3.1, \(Trace(G) = \sum \limits _{h = 1}^{m} {{G_{hh}}} = \sum \limits _{h = 1}^{m} {\sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {{A_{ha}}{B_{ab}}{C_{bh}}} } } \).

Let \({D_1} = CA\), \({D_{1}} \in {\mathbb {R}^{p \times n}}\). According to the rule of matrix multiplication, \({D_{1ij}} = \sum \limits _{h = 1}^{m} {{C_{ih}}{A_{hj}}} \). Let \({G_1} = {D_1}B\), \({G_{1}} \in {\mathbb {R}^{p \times p}}\). \({G_{1ij}} = \sum \limits _{a = 1}^{n} {{D_{1ia}}{B_{aj}}} = \sum \limits _{a = 1}^{n} {\sum \limits _{h = 1}^{m} {{C_{ih}}{A_{ha}}{B_{aj}}} } \). From the definition of the trace norm 3.1, \(Trace({G_{1}}) = \sum \limits _{b = 1}^{p} {{G_{1bb}}} = \sum \limits _{b = 1}^{p} {\sum \limits _{a = 1}^{n} {\sum \limits _{h = 1}^{m} {{C_{bh}}{A_{ha}}{B_{ab}}} } } \). Thus, Trace(ABC) = Trace(CAB).
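
A one-line numerical check of this cyclic property (shapes illustrative):

```python
# Quick check of the cyclic trace property used in Proof A.12.
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 5))
C = rng.normal(size=(5, 3))

assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))
print("Trace(ABC) == Trace(CAB) verified")
```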

Proof A.13

According to lemma 4.15, we can obtain that \(Trace(\varphi {({\widehat \chi ^{(m)}})^{T}}\widehat \chi _{vec}^{(m)T}\mu ) = Trace(\widehat \chi _{vec}^{(m)T}\mu \varphi {({\widehat \chi ^{(m)}})^T})\). Let \(A = \mu \varphi {({\widehat \chi ^{(m)}})^T}\), \(A \in {\mathbb {R}^{{J_1}{J_2} {\cdots } {J_N}}}\). According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^1 {{\mu _{ip}}\varphi ({{\widehat \chi }^{(m)}})_{pj}^T} \). Let \(D = \widehat \chi _{vec}^{(m)T}A\), \({D_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{veciq}^{(m)T}{A_{qj}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^1 {\widehat \chi _{veciq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pj}^T} } \). From the definition of the trace norm 3.1, \(Trace(D) = \sum \limits _{k = 1}^1 {{D_{kk}}} = \sum \limits _{k = 1}^1 {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^1 {\widehat \chi _{veckq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pk}^T} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^1 {\widehat \chi _{vecst}^{(m)}{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^T} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^{1} {{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^{T}} \). When q ≠ s, k ≠ t, \(\frac {{\partial (\sum \limits _{p = 1}^{1} {\widehat \chi _{veckq}^{(m)T}{\mu _{qp}}\varphi ({{\widehat \chi }^{(m)}})_{pk}^{T}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Therefore, \(\frac {{\partial {f_{1}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \mu \varphi {({\widehat \chi ^{(m)}})^{T}}\).

Proof A.14

According to lemma 4.15, we can obtain that \(Trace({\mu ^{T}}\widehat \chi _{vec}^{(m)}\varphi ({\widehat \chi ^{(m)}})) = Trace(\varphi ({\widehat \chi ^{(m)}}){\mu ^T}\widehat \chi _{vec}^{(m)})\). Let \(A = \varphi ({\widehat \chi ^{(m)}}){\mu ^T}\). According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^1 {\varphi {{({{\widehat \chi }^{(m)}})}_{ip}}\mu _{pj}^T} \). Let \(D = A\widehat \chi _{vec}^{(m)}\), \({D_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{A_{iq}}\widehat \chi _{vecqj}^{(m)}} = \sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{ip}}\mu _{pq}^{T}\widehat \chi _{vecqj}^{(m)}} } \). From the definition of the trace norm 3.1, \(Trace(D) = \sum \limits _{k = 1}^{1} {{D_{kk}}} = \sum \limits _{k = 1}^{1} {\sum \limits _{q = 1}^{{J_{1}}{J_{2}} {\cdots } {J_{N}}} {\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{kp}}\mu _{pq}^{T}\widehat \chi _{vecqk}^{(m)}} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^{1} {\varphi {{({{\widehat \chi }^{(m)}})}_{tp}}\mu _{ps}^{T}\widehat \chi _{vecst}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^1 {{\mu _{sp}}\varphi ({{\widehat \chi }^{(m)}})_{pt}^T} \). When q ≠ s, k ≠ t, \(\frac {{\partial (\sum \limits _{p = 1}^1 {\varphi {{({{\widehat \chi }^{(m)}})}_{kp}}\mu _{pq}^T\widehat \chi _{vecqk}^{(m)}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Thus, \(\frac {{\partial {f_2}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \mu \varphi {({\widehat \chi ^{(m)}})^T}\).

Proof A.15

Based on lemma 4.15, we get that \(Trace({\mu ^T}\widehat \chi _{vec}^{(m)}\widehat \chi _{vec}^{(m)T}\mu ) = Trace(\widehat \chi _{vec}^{(m)T}\mu {\mu ^T}\widehat \chi _{vec}^{(m)})\). Let \(D = \mu {\mu ^T}\), \(A = \widehat \chi _{vec}^{(m)T}D\). According to the rule of matrix multiplication, \({A_{ij}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecip}^{(m)T}{D_{pj}}} \). Let \(H = A\widehat \chi _{vec}^{(m)}\), \({H_{ij}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{A_{iq}}\widehat \chi {{{~}_{vec}^{(m)}}_{qj}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecip}^{(m)T}{D_{pq}}\widehat \chi {{{~}_{vec}^{(m)}}_{qj}}} } \). From the definition of the trace norm 3.1, \(Trace(H) = \sum \limits _{k = 1}^1 {{H_{kk}}} = \sum \limits _{k = 1}^1 {\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{veckp}^{(m)T}{D_{pq}}\widehat \chi {{{~}_{vec}^{(m)}}_{qk}}} } } \). When q = s, k = t, \(\frac {{\partial (\sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vectp}^{(m)T}{D_{ps}}\widehat \chi {{{~}_{vec}^{(m)}}_{st}}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{p = 1}^{{J_1}{J_2} {\cdots } {J_N}} {D_{sp}^T\widehat \chi _{vecpt}^{(m)}} \). When p = s, k = t, \(\frac {{\partial (\sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {\widehat \chi _{vecst}^{(m)}{D_{sq}}\widehat \chi {{{~}_{vec}^{(m)}}_{qt}}} )}}{{\partial \widehat \chi _{vecst}^{(m)}}} = \sum \limits _{q = 1}^{{J_1}{J_2} {\cdots } {J_N}} {{D_{sq}}\widehat \chi {{{~}_{vec}^{(m)}}_{qt}}} \). In other cases, \(\frac {{\partial (\widehat \chi _{veckp}^{(m)T}{D_{pq}}\widehat \chi {{{~}_{vec}^{(m)}}_{qk}})}}{{\partial \widehat \chi _{vecst}^{(m)}}} = 0\). Thus, \(\frac {{\partial {f_{3}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = D\widehat \chi _{vec}^{(m)} + {D^{T}}\widehat \chi _{vec}^{(m)}\). Since \(D = {D^T}\), \(\frac {{\partial {f_{3}}}}{{\partial \widehat \chi _{vec}^{(m)}}} = \eta \mu {\mu ^{T}}\widehat \chi _{vec}^{(m)}\). η is a constant, η = 2.
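
The gradients in Proofs A.13, A.14 and A.15 can be checked with the same finite-difference idea used after Proof A.6, treating \(\varphi ({\widehat \chi ^{(m)}})\) as a fixed scalar, as the proofs do; the dimension below is illustrative:

```python
# Finite-difference check of the gradients in Proofs A.13-A.15, with phi held
# fixed and chi-hat_vec treated as a column vector (illustrative dimension).
import numpy as np

rng = np.random.default_rng(6)
d = 8                                        # d = J_1 J_2 ... J_N
mu = rng.normal(size=(d, 1))
x = rng.normal(size=(d, 1))                  # chi-hat_vec^(m)
phi = np.array([[1.7]])                      # phi(chi-hat^(m)) held fixed

def num_grad(f, x, eps=1e-6):
    g = np.zeros_like(x)
    for i in range(x.shape[0]):
        xp, xm = x.copy(), x.copy()
        xp[i, 0] += eps
        xm[i, 0] -= eps
        g[i, 0] = (f(xp) - f(xm)) / (2 * eps)
    return g

f1 = lambda x: np.trace(phi.T @ x.T @ mu)     # Proof A.13
f2 = lambda x: np.trace(mu.T @ x @ phi)       # Proof A.14
f3 = lambda x: np.trace(mu.T @ x @ x.T @ mu)  # Proof A.15

assert np.allclose(num_grad(f1, x), mu @ phi.T, atol=1e-5)
assert np.allclose(num_grad(f2, x), mu @ phi.T, atol=1e-5)
assert np.allclose(num_grad(f3, x), 2 * mu @ mu.T @ x, atol=1e-5)
print("gradients of Proofs A.13-A.15 verified")
```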

About this article

Cite this article

Guo, X., Zhang, H., Ye, L. et al. TenLa: an approach based on controllable tensor decomposition and optimized lasso regression for judgement prediction of legal cases. Appl Intell 51, 2233–2252 (2021). https://doi.org/10.1007/s10489-020-01912-z
