Ternary tree-based structural twin support tensor machine for clustering

Published in Pattern Analysis and Applications (Theoretical advances)

Abstract

Most real-life applications involve complex data, e.g., grayscale images, where information is distributed spatially in the form of two-dimensional matrices (elements of a second-order tensor space). Traditional vector-based clustering models such as k-means and support vector clustering rely on low-dimensional feature representations for identifying patterns and are prone to losing the useful information present in the spatial structure of the data. To overcome this limitation, tensor-based clustering models can be utilized for identifying relevant patterns in matrix data, as they take advantage of the structural information present in the multi-dimensional framework and also reduce computational overhead. However, despite these advantages, tensor clustering has remained a relatively unexplored research area. In this paper, we propose a novel clustering framework, termed Ternary Tree-based Structural Least Squares Support Tensor Clustering (TT-SLSTWSTC), which builds a cluster model as a hierarchical ternary tree, where at each node non-ambiguous data points are dealt with separately from ambiguous ones using the proposed Ternary Structural Least Squares Support Tensor Machine (TS-LSTWSTM). The TS-LSTWSTM classifier considers the structural risk minimization of data alongside a symmetrical L2-norm loss function. Further, an initialization framework based on tensor k-means is used to overcome the instability caused by random initialization. To validate the efficacy of the proposed framework, computational experiments have been performed on human activity recognition and image recognition problems. Experimental results show that our method is not only fast but also yields significantly better generalization performance, and is comparatively more robust in handling heteroscedastic noise and outliers than related methods.
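The tensor k-means initialization mentioned above is not spelled out in this excerpt; a minimal sketch of one plausible variant, which clusters second-order tensors (matrices) directly under the Frobenius distance instead of flattening them, could look as follows. Function and parameter names here are illustrative, not the authors' implementation:

```python
import numpy as np

def tensor_kmeans(X, k, n_iter=50, seed=0):
    """k-means on second-order tensors (matrices), using the Frobenius
    distance; a hypothetical stand-in for a tensor k-means initialization."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)                      # shape (N, d1, d2)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(n_iter):
        # distance of every sample to every center under the Frobenius norm
        d = np.linalg.norm(X[:, None] - centers[None], axis=(2, 3))
        labels = d.argmin(axis=1)
        # recompute each center as the mean matrix of its cluster
        new_centers = np.stack([X[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

The resulting labels can then seed the root split of the ternary tree, avoiding the instability of a purely random start.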




Author information

Corresponding author: Reshma Rastogi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Finding the solution for the second problem

At iteration m, for any given non-zero vector \(u_{2,m} \in \mathbb {R}^{n_1}\), let \(a_i^{\rm T}={u_{2,m}}^{\rm T}A_i\), \(b_j^{\rm T}={u_{2,m}}^{\rm T}B_j\) and \({c}_k^{\rm T}={u_{2,m}}^{\rm T}{C}_k\); we then solve the following modified problem:

$$\begin{aligned}&\underset{v_{2,m},b_{2,m},\rho _{2,m}}{Min} \frac{\alpha }{2} \sum \limits _{j \in I_2}||b_jv_{2,m}+b_{2,m}||^2\nonumber \\&\quad + \frac{\beta }{2} (v_{2,m}^{\rm T}v_{2,m} + b_{2,m}^2) + \frac{\gamma }{2} \sum \limits _{i \in I_1}||a_iv_{2,m} +b_{2,m} + (1-\rho _{2,m}) ||^2\nonumber \\&\quad + \frac{\delta }{2} \sum \limits _{k \in I_3}||c_kv_{2,m} + b_{2,m} + (1-\rho _{2,m}-\epsilon )||^2. \end{aligned}$$
(16)

Considering Eq. (16) in vector form and differentiating it with respect to \(v_{2,m}\), \(b_{2,m}\) and \(\rho _{2,m}\) leads to the following system of linear equations:

$$\begin{aligned} \frac{\partial L}{\partial v_{2,m}}&= \alpha B^{\rm T}(Bv_{2,m}+e_2b_{2,m}) \nonumber \\&\quad +\beta v_{2,m}+\gamma A^{\rm T}(Av_{2,m}+e_1b_{2,m}+e_1(1-\rho _{2,m})) \nonumber \\&\quad +\delta C^{\rm T}(Cv_{2,m}+e_3b_{2,m} +e_3(1-\rho _{2,m}-\epsilon ))=0 \end{aligned}$$
(17)
$$\begin{aligned} \frac{\partial L}{\partial b_{2,m}}&= \alpha e_2^{\rm T}(Bv_{2,m}+e_2b_{2,m})+\beta b_{2,m}\nonumber \\&\quad +\gamma e_1^{\rm T}(Av_{2,m}+e_1b_{2,m}+e_1(1-\rho _{2,m}))+\delta e_3^{\rm T}(Cv_{2,m}+e_3b_{2,m} \nonumber \\&\quad +e_3(1-\rho _{2,m}-\epsilon ))=0 \end{aligned}$$
(18)
$$\begin{aligned} \frac{\partial L}{\partial \rho _{2,m}}&= -\gamma e_1^{\rm T}(Av_{2,m}+e_1b_{2,m}+e_1(1-\rho _{2,m})) \nonumber \\&\quad -\delta e_3^{\rm T}(Cv_{2,m}+e_3b_{2,m}+e_3(1-\rho _{2,m}-\epsilon ))=0 \nonumber \\&\Rightarrow -(\gamma e_1^{\rm T}H_1+\delta e_3^{\rm T}K_1)z_{2,m}+(\gamma e_1^{\rm T}e_1+\delta e_3^{\rm T}e_3)\rho _{2,m} \nonumber \\&\quad - (\gamma e_1^{\rm T}e_1+\delta e_3^{\rm T}e_3(1-\epsilon )) \nonumber \\&=0 \quad \text{ where } z_{2,m}=[v_{2,m}~~b_{2,m}]^{\rm T}. \end{aligned}$$
(19)

Combining Eqs. (17) and (18) along similar lines, we have

$$\begin{aligned}&(\alpha G_1^{\rm T}G_1+\beta I+\gamma H_1^{\rm T}H_1+\delta K_1^{\rm T}K_1)z_{2,m}-(\gamma H_1^{\rm T}e_1+\delta K_1^{\rm T}e_3)\rho _{2,m}\nonumber \\&\quad =-(\gamma H_1^{\rm T}e_1+\delta K_1^{\rm T}e_3 (1-\epsilon )). \end{aligned}$$
(20)
$$\begin{aligned}&\left[ {\begin{array}{cc} z_{2,m} \\ \rho _{2,m} \end{array}}\right] \nonumber \\&\quad =\left[ {\begin{array}{cc} \alpha G_1^{\rm T}G_1+\beta I+\gamma H_1^{\rm T}H_1+\delta K_1^{\rm T}K_1 &{} -(\gamma H_1^{\rm T}e_1+\delta K_1^{\rm T}e_3) \\ -(\gamma e_1^{\rm T}H_1+\delta e_3^{\rm T}K_1) &{} \gamma e_1^{\rm T}e_1 +\delta e_3^{\rm T}e_3 \end{array}}\right] ^{-1} \nonumber \\&\qquad \left[ {\begin{array}{cc}-(\gamma H_1^{\rm T}e_1+\delta K_1^{\rm T}e_3(1-\epsilon )) \\ \gamma e_1^{\rm T}e_1+\delta e_3^{\rm T}e_3(1-\epsilon ) \end{array}}\right] . \end{aligned}$$
(21)

Once the solution to Eq. (21) is calculated, the optimal values of \(v_{2,m}\), \(b_{2,m}\) and \(\rho _{2,m}\) are obtained. Next, alternately projecting with the obtained non-zero vector \(v_{2,m} \in \mathbb {R}^{n_2}\), we have \(\hat{a}_i^{\rm T}=A_i{v_{2,m}}\), \(\hat{b}_j^{\rm T}={B}_j{v_{2,m}}\) and \(\hat{c}_k^{\rm T}={C}_k{v_{2,m}}\) in Eq. (13). We then solve the following modified optimization problem:

$$\begin{aligned}&\underset{u_{2,m},b_{2,m},\rho _{2,m}}{Min} \frac{\alpha }{2} \sum \limits _{j \in I_2}||\hat{b}_ju_{2,m}+b_{2,m}||^2\nonumber \\&\quad + \frac{\beta }{2} (u_{2,m}^{\rm T}u_{2,m} + b_{2,m}^2) + \frac{\gamma }{2} \sum \limits _{i \in I_1}||\hat{a}_iu_{2,m} +b_{2,m} + (1-\rho _{2,m}) ||^2\nonumber \\&\quad + \frac{\delta }{2} \sum \limits _{k \in I_3}||\hat{c}_ku_{2,m} + b_{2,m} + (1-\rho _{2,m}-\epsilon )||^2. \end{aligned}$$
(22)

Working along the same lines as above, and considering \(\hat{z}_{2,m}=[u_{2,m} \quad b_{2,m}]^{\rm T}\), we obtain \((u_{2,m}, b_{2,m}, \rho _{2,m})\) as follows:

$$\begin{aligned}&\left[ {\begin{array}{cc} \hat{z}_{2,m} \\ \rho _{2,m} \end{array}}\right] =\left[ {\begin{array}{cc} \alpha G_2^{\rm T}G_2+\beta I+\gamma H_2^{\rm T}H_2+\delta K_2^{\rm T}K_2 &{} -(\gamma H_2^{\rm T}e_1+\delta K_2^{\rm T}e_3) \\ -(\gamma e_1^{\rm T}H_2+\delta e_3^{\rm T}K_2) &{} \gamma e_1^{\rm T}e_1 +\delta e_3^{\rm T}e_3 \end{array}}\right] ^{-1} \nonumber \\&\quad \left[ {\begin{array}{cc}-(\gamma H_2^{\rm T}e_1+\delta K_2^{\rm T}e_3(1-\epsilon )) \\ \gamma e_1^{\rm T}e_1+\delta e_3^{\rm T}e_3(1-\epsilon ) \end{array}}\right] , \end{aligned}$$
(23)

where \(H_2\), \(G_2\) and \(K_2\) are matrices of the projected points from classes +1, −1 and 0, respectively, augmented with a column of ones. Equations (21) and (23) are solved alternately until \(u_{2,m}\), \(v_{2,m}\), \(b_{2,m}\) and \(\rho _{2,m}\) converge as per some predefined termination criterion.
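One step of this alternating scheme can be sketched in NumPy. The sketch below is a minimal illustration of the block linear system derived from Eqs. (17)–(19): with \(u\) fixed, each matrix sample is projected onto \(u\) and \((v, b, \rho)\) is recovered in closed form. All function and variable names are illustrative assumptions, not the authors' code:

```python
import numpy as np

def solve_v_step(A, B, C, u, alpha, beta, gamma, delta, eps):
    """With u fixed, project each matrix sample onto u and solve the block
    linear system of the second problem for (v, b, rho). A, B, C hold the
    samples of classes +1, -1 and 0 as arrays of shape (n_samples, p, q)."""
    # row i of Ah is u^T A_i, etc.
    Ah = np.einsum('p,ipq->iq', u, A)
    Bh = np.einsum('p,ipq->iq', u, B)
    Ch = np.einsum('p,ipq->iq', u, C)
    ones = lambda M: np.ones((M.shape[0], 1))
    H = np.hstack([Ah, ones(Ah)])          # [A  e1]
    G = np.hstack([Bh, ones(Bh)])          # [B  e2]
    K = np.hstack([Ch, ones(Ch)])          # [C  e3]
    n = H.shape[1]
    e1, e3 = ones(Ah), ones(Ch)
    # assemble the (n+1) x (n+1) block system in [z; rho]
    top_left = alpha * G.T @ G + beta * np.eye(n) + gamma * H.T @ H + delta * K.T @ K
    top_right = -(gamma * H.T @ e1 + delta * K.T @ e3)
    bot_left = -(gamma * e1.T @ H + delta * e3.T @ K)
    bot_right = np.array([[gamma * len(Ah) + delta * len(Ch)]])
    M = np.block([[top_left, top_right], [bot_left, bot_right]])
    r = np.concatenate([-(gamma * H.T @ e1 + delta * (1 - eps) * K.T @ e3),
                        [[gamma * len(Ah) + delta * (1 - eps) * len(Ch)]]])
    sol = np.linalg.solve(M, r).ravel()
    v, b, rho = sol[:-2], sol[-2], sol[-1]
    return v, b, rho
```

Since \(\beta > 0\) makes the system matrix positive definite, a single `np.linalg.solve` suffices; the full procedure alternates this step with the symmetric \(u\)-step until convergence.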

Appendix 2: Finding the solution for the third problem

At iteration m, for any given non-zero vector \(u_{3,m} \in \mathbb {R}^{n_1}\), let \(a_i^{\rm T}={u_{3,m}}^{\rm T}A_i\), \(b_j^{\rm T}={u_{3,m}}^{\rm T}B_j\) and \({c}_k^{\rm T}={u_{3,m}}^{\rm T}{C}_k\); we then solve the following modified problem:

$$\begin{aligned}&\underset{v_{3,m},b_{3,m}}{Min} \frac{\alpha }{2} \sum \limits _{k \in I_3}||c_kv_{3,m}+b_{3,m}||^2\nonumber \\&\quad + \frac{\beta }{2} (v_{3,m}^{\rm T}v_{3,m} + b_{3,m}^2) + \frac{\gamma }{2} \sum \limits _{i \in I_1}||a_iv_{3,m} +b_{3,m} + (1-\rho _1-\epsilon ) ||^2\nonumber \\&\quad + \frac{\delta }{2} \sum \limits _{j \in I_2}||b_jv_{3,m} + b_{3,m} + (1-\rho _2-\epsilon )||^2. \end{aligned}$$
(24)

Considering Eq. (24) in vector form and differentiating it with respect to \(v_{3,m}\) and \(b_{3,m}\) leads to the following system of linear equations:

$$\begin{aligned} \frac{\partial L}{\partial v_{3,m}}&= \alpha C^{\rm T}(Cv_{3,m}+e_3b_{3,m})+\beta v_{3,m}\nonumber \\&\quad +\gamma A^{\rm T}(Av_{3,m}+e_1b_{3,m}+e_1(1-\rho _1-\epsilon )) \nonumber \\&\quad +\delta B^{\rm T}(Bv_{3,m}+e_2b_{3,m} +e_2(1-\rho _{2}-\epsilon ))=0 \end{aligned}$$
(25)
$$\begin{aligned} \frac{\partial L}{\partial b_{3,m}}&= \alpha e_3^{\rm T}(Cv_{3,m}+e_3b_{3,m})+\beta b_{3,m}\nonumber \\&\quad +\gamma e_1^{\rm T}(Av_{3,m}+e_1b_{3,m} +e_1(1-\rho _{1}-\epsilon ))\nonumber \\&\quad +\delta e_2^{\rm T}(Bv_{3,m}+e_2b_{3,m} +e_2(1-\rho _{2}-\epsilon ))=0 \end{aligned}$$
(26)

Combining Eqs. (25) and (26) along similar lines, we have

$$\begin{aligned}&(\alpha K_1^{\rm T}K_1+\beta I+\gamma H_1^{\rm T}H_1+\delta G_1^{\rm T}G_1)z_{3,m}\nonumber \\&\quad =-(\gamma (1-\rho _{1}-\epsilon ) H_1^{\rm T}e_1+\delta (1-\rho _{2}-\epsilon ) G_1^{\rm T}e_2) \end{aligned}$$
(27)
$$\begin{aligned}&z_{3,m}=-(\alpha K_1^{\rm T}K_1+\beta I+\gamma H_1^{\rm T}H_1+\delta G_1^{\rm T}G_1)^{-1} (\gamma (1-\rho _{1}-\epsilon ) H_1^{\rm T}e_1\nonumber \\&\qquad +\delta (1-\rho _{2}-\epsilon ) G_1^{\rm T}e_2) \end{aligned}$$
(28)

Once the solution to Eq. (28) is calculated, the optimal values of \(v_{3,m}\) and \(b_{3,m}\) are obtained. Next, alternately projecting with the obtained non-zero vector \(v_{3,m} \in \mathbb {R}^{n_2}\), we have \(\hat{a}_i^{\rm T}=A_i{v_{3,m}}\), \(\hat{b}_j^{\rm T}={B}_j{v_{3,m}}\) and \(\hat{c}_k^{\rm T}={C}_k{v_{3,m}}\) in Eq. (14). We then solve the following modified optimization problem:

$$\begin{aligned}&\underset{u_{3,m},b_{3,m}}{Min} \frac{\alpha }{2} \sum \limits _{k \in I_3}||\hat{c}_ku_{3,m}+b_{3,m}||^2+ \frac{\beta }{2} (u_{3,m}^{\rm T}u_{3,m} + b_{3,m}^2) \nonumber \\&\quad + \frac{\gamma }{2} \sum \limits _{i \in I_1}||\hat{a}_iu_{3,m} +b_{3,m} + (1-\rho _1-\epsilon ) ||^2\nonumber \\&\quad + \frac{\delta }{2} \sum \limits _{j \in I_2}||\hat{b}_ju_{3,m} + b_{3,m} + (1-\rho _2-\epsilon )||^2. \end{aligned}$$
(29)

Working along the same lines as above, and considering \(\hat{z}_{3,m}=[u_{3,m} \quad b_{3,m}]^{\rm T}\), we obtain \((u_{3,m}, b_{3,m})\) as follows:

$$\begin{aligned} \hat{z}_{3,m}&= -(\alpha K_2^{\rm T}K_2+\beta I+\gamma H_2^{\rm T}H_2+\delta G_2^{\rm T}G_2)^{-1} (\gamma (1-\rho _{1}-\epsilon ) H_2^{\rm T}e_1\nonumber \\ &\quad +\delta (1-\rho _{2}-\epsilon ) G_2^{\rm T}e_2) \end{aligned}$$
(30)

where \(H_2\), \(G_2\) and \(K_2\) are matrices of the projected points from classes +1, −1 and 0, respectively, augmented with a column of ones. Equations (28) and (30) are solved alternately until \(u_{3,m}\), \(v_{3,m}\) and \(b_{3,m}\) converge as per some predefined termination criterion.
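Because \(\rho_1\) and \(\rho_2\) are fixed from the first two problems, the third problem reduces to a single regularized least-squares solve per alternation. A minimal NumPy sketch of the \(v\)-step of Eq. (28), under the same illustrative naming assumptions as before (not the authors' code):

```python
import numpy as np

def solve_third_v_step(A, B, C, u, alpha, beta, gamma, delta, rho1, rho2, eps):
    """With u fixed and rho1, rho2 taken from the first two problems,
    recover z = (v, b) for the third hyperplane in closed form."""
    Ah = np.einsum('p,ipq->iq', u, A)      # rows are u^T A_i
    Bh = np.einsum('p,ipq->iq', u, B)
    Ch = np.einsum('p,ipq->iq', u, C)
    aug = lambda M: np.hstack([M, np.ones((M.shape[0], 1))])
    H, G, K = aug(Ah), aug(Bh), aug(Ch)    # classes +1, -1 and 0
    n = H.shape[1]
    # system matrix is positive definite for beta > 0
    M = alpha * K.T @ K + beta * np.eye(n) + gamma * H.T @ H + delta * G.T @ G
    r = (gamma * (1 - rho1 - eps) * H.T @ np.ones(len(Ah))
         + delta * (1 - rho2 - eps) * G.T @ np.ones(len(Bh)))
    z = -np.linalg.solve(M, r)
    return z[:-1], z[-1]                   # v, b
```

The symmetric \(u\)-step solves the same system built from the matrices projected along \(v\), and the two solves are alternated until the termination criterion is met.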

Cite this article

Rastogi, R., Sharma, S. Ternary tree-based structural twin support tensor machine for clustering. Pattern Anal Applic 24, 61–74 (2021). https://doi.org/10.1007/s10044-020-00902-8
