Ternary tree-based structural twin support tensor machine for clustering

Published in Pattern Analysis and Applications (Theoretical advances)

Abstract

Most real-life applications involve complex data, e.g., grayscale images, where information is distributed spatially in the form of two-dimensional matrices (elements of a second-order tensor space). Traditional vector-based clustering models such as k-means and support vector clustering rely on low-dimensional feature representations for identifying patterns and are prone to losing the useful information present in the spatial structure of the data. To overcome this limitation, tensor-based clustering models can be utilized for identifying relevant patterns in matrix data, as they take advantage of the structural information present in the multi-dimensional framework and also reduce computational overhead. However, despite these advantages, tensor clustering has remained a relatively unexplored research area. In this paper, we propose a novel clustering framework, termed Ternary Tree-based Structural Least Squares Support Tensor Clustering (TT-SLSTWSTC), which builds a cluster model as a hierarchical ternary tree, where at each node non-ambiguous data points are dealt with separately from ambiguous ones using the proposed Ternary Structural Least Squares Support Tensor Machine (TS-LSTWSTM). The TS-LSTWSTM classifier considers the structural risk minimization of data alongside a symmetrical L2-norm loss function. Further, an initialization framework based on tensor k-means is used to overcome the instability caused by random initialization. To validate the efficacy of the proposed framework, computational experiments have been performed on human activity recognition and image recognition problems. Experimental results show that our method is not only fast but also yields significantly better generalization performance, and is comparatively more robust in handling heteroscedastic noise and outliers than related methods.
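The tensor k-means initialization mentioned above is not spelled out in this excerpt; a minimal sketch of one plausible variant, which clusters second-order tensors (matrices) directly under the Frobenius distance instead of flattening them, could look as follows. Function and parameter names here are illustrative, not the authors' implementation:

```python
import numpy as np

def tensor_kmeans(X, k, n_iter=50, seed=0):
    """k-means on second-order tensors (matrices), using the Frobenius
    distance; a hypothetical stand-in for a tensor k-means initialization."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)                      # shape (N, d1, d2)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(n_iter):
        # distance of every sample to every center under the Frobenius norm
        d = np.linalg.norm(X[:, None] - centers[None], axis=(2, 3))
        labels = d.argmin(axis=1)
        # recompute each center as the mean matrix of its cluster
        new_centers = np.stack([X[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

The resulting labels can then seed the root split of the ternary tree, avoiding the instability of a purely random start.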




Author information

Corresponding author: Reshma Rastogi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Finding the solution for the second problem

At iteration m, for any given non-zero vector \(u_{2,m} \in \mathbb {R}^{n_1}\), let \(a_i^{\rm T}={u_{2,m}}^{\rm T}A_i\), \(b_j^{\rm T}={u_{2,m}}^{\rm T}B_j\) and \({c}_k^{\rm T}={u_{2,m}}^{\rm T}{C}_k\); we then solve the following modified problem:

$$\begin{aligned}&\underset{v_{2,m},b_{2,m},\rho _{2,m}}{Min} \frac{\alpha }{2} \sum \limits _{j \in I_2}||b_jv_{2,m}+b_{2,m}||^2\nonumber \\&\quad + \frac{\beta }{2} (v_{2,m}^{\rm T}v_{2,m} + b_{2,m}^2) + \frac{\gamma }{2} \sum \limits _{i \in I_1}||a_iv_{2,m} +b_{2,m} + (1-\rho _{2,m}) ||^2\nonumber \\&\quad + \frac{\delta }{2} \sum \limits _{k \in I_3}||c_kv_{2,m} + b_{2,m} + (1-\rho _{2,m}-\epsilon )||^2. \end{aligned}$$
(16)

Considering Eq. (16) in vector form and differentiating it with respect to \(v_{2,m}\), \(b_{2,m}\) and \(\rho _{2,m}\) leads to the following system of linear equations:

$$\begin{aligned} \frac{\partial L}{\partial v_{2,m}}&= \alpha B^{\rm T}(Bv_{2,m}+e_2b_{2,m}) \nonumber \\&\quad +\beta v_{2,m}+\gamma A^{\rm T}(Av_{2,m}+e_1b_{2,m}+e_1(1-\rho _{2,m})) \nonumber \\&\quad +\delta C^{\rm T}(Cv_{2,m}+e_3b_{2,m} +e_3(1-\rho _{2,m}-\epsilon ))=0 \end{aligned}$$
(17)
$$\begin{aligned} \frac{\partial L}{\partial b_{2,m}}&= \alpha e_2^{\rm T}(Bv_{2,m}+e_2b_{2,m})+\beta b_{2,m}\nonumber \\&\quad +\gamma e_1^{\rm T}(Av_{2,m}+e_1b_{2,m}+e_1(1-\rho _{2,m}))+\delta e_3^{\rm T}(Cv_{2,m}+e_3b_{2,m} \nonumber \\&\quad +e_3(1-\rho _{2,m}-\epsilon ))=0 \end{aligned}$$
(18)
$$\begin{aligned} \frac{\partial L}{\partial \rho _{2,m}}&= -\gamma e_1^{\rm T}(Av_{2,m}+e_1b_{2,m}+e_1(1-\rho _{2,m})) \nonumber \\&\quad -\delta e_3^{\rm T}(Cv_{2,m}+e_3b_{2,m}+e_3(1-\rho _{2,m}-\epsilon ))=0 \nonumber \\&\Rightarrow -(\gamma e_1^{\rm T}H_1+\delta e_3^{\rm T}K_1)z_{2,m}+(\gamma e_1^{\rm T}e_1+\delta e_3^{\rm T}e_3)\rho _{2,m} \nonumber \\&\quad - (\gamma e_1^{\rm T}e_1+\delta e_3^{\rm T}e_3(1-\epsilon )) \nonumber \\&=0 \quad \text{ where } z_{2,m}=[v_{2,m}~~b_{2,m}]^{\rm T}. \end{aligned}$$
(19)

Combining Eqs. (17) and (18) along similar lines, we have

$$\begin{aligned}&(\alpha G_1^{\rm T}G_1+\beta I+\gamma H_1^{\rm T}H_1+\delta K_1^{\rm T}K_1)z_{2,m}-(\gamma H_1^{\rm T}e_1+\delta K_1^{\rm T}e_3)\rho _{2,m}\nonumber \\&\quad =-(\gamma H_1^{\rm T}e_1+\delta K_1^{\rm T}e_3 (1-\epsilon )). \end{aligned}$$
(20)
$$\begin{aligned}&\left[ {\begin{array}{cc} z_{2,m} \\ \rho _{2,m} \end{array}}\right] \nonumber \\&\quad =\left[ {\begin{array}{cc} \alpha G_1^{\rm T}G_1+\beta I+\gamma H_1^{\rm T}H_1+\delta K_1^{\rm T}K_1 &{} -(\gamma H_1^{\rm T}e_1+\delta K_1^{\rm T}e_3) \\ -(\gamma e_1^{\rm T}H_1+\delta e_3^{\rm T}K_1) &{} \gamma e_1^{\rm T}e_1 +\delta e_3^{\rm T}e_3 \end{array}}\right] ^{-1} \nonumber \\&\qquad \left[ {\begin{array}{cc}-(\gamma H_1^{\rm T}e_1+\delta K_1^{\rm T}e_3(1-\epsilon )) \\ \gamma e_1^{\rm T}e_1+\delta e_3^{\rm T}e_3(1-\epsilon ) \end{array}}\right] . \end{aligned}$$
(21)

Once the solution to Eq. (21) is calculated, the optimal values of \(v_{2,m}\), \(b_{2,m}\) and \(\rho _{2,m}\) are obtained. Next, alternately projecting with the obtained non-zero vector \(v_{2,m} \in \mathbb {R}^{n_2}\), we have \(\hat{a}_i^{\rm T}=A_i{v_{2,m}}\), \(\hat{b}_j^{\rm T}={B}_j{v_{2,m}}\) and \(\hat{c}_k^{\rm T}={C}_k{v_{2,m}}\) in Eq. (13). We then solve the following modified optimization problem:

$$\begin{aligned}&\underset{u_{2,m},b_{2,m},\rho _{2,m}}{Min} \frac{\alpha }{2} \sum \limits _{j \in I_2}||\hat{b}_ju_{2,m}+b_{2,m}||^2\nonumber \\&\quad + \frac{\beta }{2} (u_{2,m}^{\rm T}u_{2,m} + b_{2,m}^2) + \frac{\gamma }{2} \sum \limits _{i \in I_1}||\hat{a}_iu_{2,m} +b_{2,m} + (1-\rho _{2,m}) ||^2\nonumber \\&\quad + \frac{\delta }{2} \sum \limits _{k \in I_3}||\hat{c}_ku_{2,m} + b_{2,m} + (1-\rho _{2,m}-\epsilon )||^2. \end{aligned}$$
(22)

Working along the same lines as above, and considering \(\hat{z}_{2,m}=[u_{2,m} \quad b_{2,m}]^{\rm T}\), we obtain \((u_{2,m}, b_{2,m}, \rho _{2,m})\) as follows:

$$\begin{aligned}&\left[ {\begin{array}{cc} \hat{z}_{2,m} \\ \rho _{2,m} \end{array}}\right] =\left[ {\begin{array}{cc} \alpha G_2^{\rm T}G_2+\beta I+\gamma H_2^{\rm T}H_2+\delta K_2^{\rm T}K_2 &{} -(\gamma H_2^{\rm T}e_1+\delta K_2^{\rm T}e_3) \\ -(\gamma e_1^{\rm T}H_2+\delta e_3^{\rm T}K_2) &{} \gamma e_1^{\rm T}e_1 +\delta e_3^{\rm T}e_3 \end{array}}\right] ^{-1} \nonumber \\&\quad \left[ {\begin{array}{cc}-(\gamma H_2^{\rm T}e_1+\delta K_2^{\rm T}e_3(1-\epsilon )) \\ \gamma e_1^{\rm T}e_1+\delta e_3^{\rm T}e_3(1-\epsilon ) \end{array}}\right] , \end{aligned}$$
(23)

where \(H_2\), \(G_2\) and \(K_2\) are matrices of the projected points from classes +1, −1 and 0, respectively, augmented with a column of ones. Equations (21) and (23) are solved alternately until \(u_{2,m}\), \(v_{2,m}\), \(b_{2,m}\) and \(\rho _{2,m}\) converge as per some predefined termination criterion.
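One step of this alternating scheme can be sketched in NumPy. The sketch below is a minimal illustration of the block linear system derived from Eqs. (17)–(19): with \(u\) fixed, each matrix sample is projected onto \(u\) and \((v, b, \rho)\) is recovered in closed form. All function and variable names are illustrative assumptions, not the authors' code:

```python
import numpy as np

def solve_v_step(A, B, C, u, alpha, beta, gamma, delta, eps):
    """With u fixed, project each matrix sample onto u and solve the block
    linear system of the second problem for (v, b, rho). A, B, C hold the
    samples of classes +1, -1 and 0 as arrays of shape (n_samples, p, q)."""
    # row i of Ah is u^T A_i, etc.
    Ah = np.einsum('p,ipq->iq', u, A)
    Bh = np.einsum('p,ipq->iq', u, B)
    Ch = np.einsum('p,ipq->iq', u, C)
    ones = lambda M: np.ones((M.shape[0], 1))
    H = np.hstack([Ah, ones(Ah)])          # [A  e1]
    G = np.hstack([Bh, ones(Bh)])          # [B  e2]
    K = np.hstack([Ch, ones(Ch)])          # [C  e3]
    n = H.shape[1]
    e1, e3 = ones(Ah), ones(Ch)
    # assemble the (n+1) x (n+1) block system in [z; rho]
    top_left = alpha * G.T @ G + beta * np.eye(n) + gamma * H.T @ H + delta * K.T @ K
    top_right = -(gamma * H.T @ e1 + delta * K.T @ e3)
    bot_left = -(gamma * e1.T @ H + delta * e3.T @ K)
    bot_right = np.array([[gamma * len(Ah) + delta * len(Ch)]])
    M = np.block([[top_left, top_right], [bot_left, bot_right]])
    r = np.concatenate([-(gamma * H.T @ e1 + delta * (1 - eps) * K.T @ e3),
                        [[gamma * len(Ah) + delta * (1 - eps) * len(Ch)]]])
    sol = np.linalg.solve(M, r).ravel()
    v, b, rho = sol[:-2], sol[-2], sol[-1]
    return v, b, rho
```

Since \(\beta > 0\) makes the system matrix positive definite, a single `np.linalg.solve` suffices; the full procedure alternates this step with the symmetric \(u\)-step until convergence.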

Appendix 2: Finding the solution for the third problem

At iteration m, for any given non-zero vector \(u_{3,m} \in \mathbb {R}^{n_1}\), let \(a_i^{\rm T}={u_{3,m}}^{\rm T}A_i\), \(b_j^{\rm T}={u_{3,m}}^{\rm T}B_j\) and \({c}_k^{\rm T}={u_{3,m}}^{\rm T}{C}_k\); we then solve the following modified problem:

$$\begin{aligned}&\underset{v_{3,m},b_{3,m}}{Min} \frac{\alpha }{2} \sum \limits _{k \in I_3}||c_kv_{3,m}+b_{3,m}||^2\nonumber \\&\quad + \frac{\beta }{2} (v_{3,m}^{\rm T}v_{3,m} + b_{3,m}^2) + \frac{\gamma }{2} \sum \limits _{i \in I_1}||a_iv_{3,m} +b_{3,m} + (1-\rho _1-\epsilon ) ||^2\nonumber \\&\quad + \frac{\delta }{2} \sum \limits _{j \in I_2}||b_jv_{3,m} + b_{3,m} + (1-\rho _2-\epsilon )||^2. \end{aligned}$$
(24)

Considering Eq. (24) in vector form and differentiating it with respect to \(v_{3,m}\) and \(b_{3,m}\) leads to the following system of linear equations:

$$\begin{aligned} \frac{\partial L}{\partial v_{3,m}}&= \alpha C^{\rm T}(Cv_{3,m}+e_3b_{3,m})+\beta v_{3,m}\nonumber \\&\quad +\gamma A^{\rm T}(Av_{3,m}+e_1b_{3,m}+e_1(1-\rho _1-\epsilon )) \nonumber \\&\quad +\delta B^{\rm T}(Bv_{3,m}+e_2b_{3,m} +e_2(1-\rho _{2}-\epsilon ))=0 \end{aligned}$$
(25)
$$\begin{aligned} \frac{\partial L}{\partial b_{3,m}}&= \alpha e_3^{\rm T}(Cv_{3,m}+e_3b_{3,m})+\beta b_{3,m}\nonumber \\&\quad +\gamma e_1^{\rm T}(Av_{3,m}+e_1b_{3,m} +e_1(1-\rho _{1}-\epsilon ))\nonumber \\&\quad +\delta e_2^{\rm T}(Bv_{3,m}+e_2b_{3,m} +e_2(1-\rho _{2}-\epsilon ))=0 \end{aligned}$$
(26)

Combining Eqs. (25) and (26) along similar lines, we have

$$\begin{aligned}&(\alpha K_1^{\rm T}K_1+\beta I+\gamma H_1^{\rm T}H_1+\delta G_1^{\rm T}G_1)z_{3,m}\nonumber \\&\quad =-(\gamma (1-\rho _{1}-\epsilon ) H_1^{\rm T}e_1+\delta (1-\rho _{2}-\epsilon ) G_1^{\rm T}e_2) \end{aligned}$$
(27)
$$\begin{aligned}&z_{3,m}=-(\alpha K_1^{\rm T}K_1+\beta I+\gamma H_1^{\rm T}H_1+\delta G_1^{\rm T}G_1)^{-1} (\gamma (1-\rho _{1}-\epsilon ) H_1^{\rm T}e_1\nonumber \\&\qquad +\delta (1-\rho _{2}-\epsilon ) G_1^{\rm T}e_2) \end{aligned}$$
(28)

Once the solution to Eq. (28) is calculated, the optimal values of \(v_{3,m}\) and \(b_{3,m}\) are obtained. Next, alternately projecting with the obtained non-zero vector \(v_{3,m} \in \mathbb {R}^{n_2}\), we have \(\hat{a}_i^{\rm T}=A_i{v_{3,m}}\), \(\hat{b}_j^{\rm T}={B}_j{v_{3,m}}\) and \(\hat{c}_k^{\rm T}={C}_k{v_{3,m}}\) in Eq. (14). We then solve the following modified optimization problem:

$$\begin{aligned}&\underset{u_{3,m},b_{3,m}}{Min} \frac{\alpha }{2} \sum \limits _{k \in I_3}||\hat{c}_ku_{3,m}+b_{3,m}||^2+ \frac{\beta }{2} (u_{3,m}^{\rm T}u_{3,m} + b_{3,m}^2) \nonumber \\&\quad + \frac{\gamma }{2} \sum \limits _{i \in I_1}||\hat{a}_iu_{3,m} +b_{3,m} + (1-\rho _1-\epsilon ) ||^2\nonumber \\&\quad + \frac{\delta }{2} \sum \limits _{j \in I_2}||\hat{b}_ju_{3,m} + b_{3,m} + (1-\rho _2-\epsilon )||^2. \end{aligned}$$
(29)

Working along the same lines as above, and considering \(\hat{z}_{3,m}=[u_{3,m} \quad b_{3,m}]^{\rm T}\), we obtain \((u_{3,m}, b_{3,m})\) as follows:

$$\begin{aligned} \hat{z}_{3,m}&= -(\alpha K_2^{\rm T}K_2+\beta I+\gamma H_2^{\rm T}H_2+\delta G_2^{\rm T}G_2)^{-1} (\gamma (1-\rho _{1}-\epsilon ) H_2^{\rm T}e_1\nonumber \\ &\quad +\delta (1-\rho _{2}-\epsilon ) G_2^{\rm T}e_2) \end{aligned}$$
(30)

where \(H_2\), \(G_2\) and \(K_2\) are matrices of the projected points from classes +1, −1 and 0, respectively, augmented with a column of ones. Equations (28) and (30) are solved alternately until \(u_{3,m}\), \(v_{3,m}\) and \(b_{3,m}\) converge as per some predefined termination criterion.
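Because \(\rho_1\) and \(\rho_2\) are fixed from the first two problems, the third problem reduces to a single regularized least-squares solve per alternation. A minimal NumPy sketch of the \(v\)-step of Eq. (28), under the same illustrative naming assumptions as before (not the authors' code):

```python
import numpy as np

def solve_third_v_step(A, B, C, u, alpha, beta, gamma, delta, rho1, rho2, eps):
    """With u fixed and rho1, rho2 taken from the first two problems,
    recover z = (v, b) for the third hyperplane in closed form."""
    Ah = np.einsum('p,ipq->iq', u, A)      # rows are u^T A_i
    Bh = np.einsum('p,ipq->iq', u, B)
    Ch = np.einsum('p,ipq->iq', u, C)
    aug = lambda M: np.hstack([M, np.ones((M.shape[0], 1))])
    H, G, K = aug(Ah), aug(Bh), aug(Ch)    # classes +1, -1 and 0
    n = H.shape[1]
    # system matrix is positive definite for beta > 0
    M = alpha * K.T @ K + beta * np.eye(n) + gamma * H.T @ H + delta * G.T @ G
    r = (gamma * (1 - rho1 - eps) * H.T @ np.ones(len(Ah))
         + delta * (1 - rho2 - eps) * G.T @ np.ones(len(Bh)))
    z = -np.linalg.solve(M, r)
    return z[:-1], z[-1]                   # v, b
```

The symmetric \(u\)-step solves the same system built from the matrices projected along \(v\), and the two solves are alternated until the termination criterion is met.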

Cite this article

Rastogi, R., Sharma, S. Ternary tree-based structural twin support tensor machine for clustering. Pattern Anal Applic 24, 61–74 (2021). https://doi.org/10.1007/s10044-020-00902-8
