
Dimensionality reduction and topographic mapping of binary tensors

Pattern Analysis and Applications

Abstract

In this paper, a decomposition method for binary tensors, the generalized multi-linear model for principal component analysis (GMLPCA), is proposed. To the best of our knowledge, at present there is no other principled, systematic framework for decomposition or topographic mapping of binary tensors. In the model formulation, we constrain the natural parameters of the Bernoulli distributions for each tensor element to lie in a sub-space spanned by a reduced set of basis (principal) tensors. We evaluate and compare the proposed GMLPCA technique with existing real-valued tensor decomposition methods in two scenarios: (1) in a series of controlled experiments involving synthetic data; (2) on a real-world biological dataset of DNA sub-sequences from different functional regions, with sequences represented by binary tensors. The experiments suggest that the GMLPCA model is better suited for modelling binary tensors than its real-valued counterparts. Furthermore, we extend our GMLPCA model to the semi-supervised setting by forcing the model to search for a natural parameter subspace that represents a user-specified compromise between the modelling quality and the degree of class separation.


Notes

  1. The updating formulas of CSA and MPCA are similar, the only difference being that MPCA subtracts the mean from the data tensors.

  2. By this term we denote a short, widespread sequence of nucleotides that has, or may have, biological significance.

  3. R = [3 × 3] denotes a natural parameter subspace spanned by 3 row and 3 column basis vectors.

  4. \(\theta\)’s are fixed current values of the parameters and should be treated as constants.

References

  1. Lu H, Plataniotis KN, Venetsanopoulos AN (2008) MPCA: multilinear principal component analysis of tensor objects. IEEE Trans Neural Netw 19:18–39

  2. Nolker C, Ritter H (2002) Visual recognition of continuous hand postures. IEEE Trans Neural Netw 13:983–994

  3. Jia K, Gong S (2005) Multi-modal tensor face for simultaneous super-resolution and recognition. In: 10th IEEE international conference on computer vision, Beijing, vol 2, pp 1683–1690

  4. Renard N, Bourennane S (2008) An ICA-based multilinear algebra tools for dimensionality reduction in hyperspectral imagery. In: IEEE international conference on acoustics, speech and signal processing, Las Vegas, NV, vol 4, pp 1345–1348

  5. Cai D, He X, Han J (2006) Tensor space model for document analysis. In: Proceedings of the 29th annual international ACM SIGIR conference on research and development in information retrieval, Seattle, WA, pp 625–626

  6. Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philos Mag 2(6):559–572

  7. Lu H, Plataniotis KN, Venetsanopoulos AN (2009) Uncorrelated multilinear discriminant analysis with regularization and aggregation for tensor object recognition. IEEE Trans Neural Netw 20:103–123

  8. Zafeiriou S (2009) Discriminant nonnegative tensor factorization algorithms. IEEE Trans Neural Netw 20:217–235

  9. Panagakis Y, Kotropoulos C, Arce G (2010) Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification. IEEE Trans Audio Speech Lang Process 18:576–588

  10. Brachat J, Comon P, Mourrain B, Tsigaridas E (2010) Symmetric tensor decomposition. Linear Algebra Appl (in press, corrected proof)

  11. Schein A, Saul L, Ungar L (2003) A generalized linear model for principal component analysis of binary data. In: 9th international workshop on artificial intelligence and statistics, Key West, FL

  12. Acar E, Yener B (2009) Unsupervised multiway data analysis: a literature survey. IEEE Trans Knowl Data Eng 21:6–20

  13. Wang H, Ahuja N (2005) Rank-R approximation of tensors: using image-as-matrix representation. In: IEEE conference on computer vision and pattern recognition, pp 346–353

  14. De Lathauwer L, De Moor B, Vandewalle J (2000) A multilinear singular value decomposition. SIAM J Matrix Anal Appl 21(4):1253–1278

  15. Kofidis E, Regalia PA (2001) On the best rank-1 approximation of higher-order supersymmetric tensors. SIAM J Matrix Anal Appl 23(3):863–884

  16. Wang H, Ahuja N (2004) Compact representation of multidimensional data using tensor rank-one decomposition. In: Proceedings of the 17th international conference on pattern recognition, Cambridge, UK, pp 44–47

  17. Ye J, Janardan R, Li Q (2004) GPCA: an efficient dimension reduction scheme for image compression and retrieval. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining (KDD '04), New York, NY, pp 354–363

  18. Xu D, Yan S, Zhang L, Lin S, Zhang H-J, Huang T (2008) Reconstruction and recognition of tensor-based objects with concurrent subspaces analysis. IEEE Trans Circuits Syst Video Technol 18:36–47

  19. Lu H, Plataniotis KN, Venetsanopoulos AN (2009) Uncorrelated multilinear principal component analysis for unsupervised multilinear subspace learning. IEEE Trans Neural Netw 20(11):1820–1836

  20. Inoue K, Hara K, Urahama K (2009) Robust multilinear principal component analysis. In: 12th IEEE international conference on computer vision, pp 591–597

  21. Lu H, Plataniotis KN, Venetsanopoulos AN (2011) A survey of multilinear subspace learning for tensor data. Pattern Recognit 44:1540–1551

  22. Cortes C, Mohri M (2003) AUC optimization vs. error rate minimization. In: Advances in neural information processing systems, vol 16, pp 313–320

  23. Li X, Zeng J, Yan H (2008) PCA-HPR: a principle component analysis model for human promoter recognition. Bioinformation 2(9):373–378

  24. Sonnenburg S, Zien A, Philips P, Ratsch G (2008) POIMs: positional oligomer importance matrices–understanding support vector machine-based signal detectors. Bioinformatics 24(13):i6–i14

  25. Baldi P, Brunak S (2001) Bioinformatics: the machine learning approach, 2nd edn. MIT Press, Cambridge, MA

  26. Isaev A (2007) Introduction to mathematical methods in bioinformatics. Springer, Secaucus

  27. Wakaguri H, Yamashita R, Suzuki Y, Sugano S, Nakai K (2008) DBTSS: database of transcription start sites. Nucleic Acids Res 36(Database issue):97–101

  28. Saxonov S, Daizadeh I, Fedorov A, Gilbert W (2000) EID: the exon-intron database—an exhaustive database of protein-coding intron-containing genes. Nucleic Acids Res 28(1):185–190

  29. Ron D, Singer Y, Tishby N (1996) The power of amnesia: learning probabilistic automata with variable memory length. Mach Learn 25(2–3):117–149

  30. Tino P, Dorffner G (2001) Predicting the future of discrete sequences from fractal representations of the past. Mach Learn 45(2):187–217

  31. Cross S, Clark V, Bird A (1999) Isolation of CpG islands from large genomic clones. Nucleic Acids Res 27(10):2099–2107

  32. Bodén M, Bailey TL (2007) Associating transcription factor-binding site motifs with target GO terms and target genes. Nucleic Acids Res 36(12):4108–4117

  33. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25(1):25–29

  34. Straussman R, Nejman D, Roberts D, Steinfeld I, Blum B, Benvenisty N, Simon I, Yakhini Z, Cedar H (2009) Developmental programming of CpG island methylation profiles in the human genome. Nat Struct Mol Biol 16(5):564–571

  35. Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS (2009) MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 37(Web Server issue):W202–W208

  36. Gupta S, Stamatoyannopoulos J, Bailey T, Noble W (2007) Quantifying similarity between motifs. Genome Biol 8(2):R24

  37. Globerson A, Roweis S (2006) Metric learning by collapsing classes. Adv Neural Inf Process Syst 18:451–458


Acknowledgements

Jakub Mažgut was supported by the Slovak Research and Development Agency under the contract No. APVV-0208-10 and by the Scientific Grant Agency of Slovak Republic, Grant No. VG1/0971/11. Peter Tiňo was supported by the DfES UK/Hong Kong Fellowship for Excellence and a BBSRC Grant (No. BB/H012508/1). Mikael Bodén was supported by the ARC Centre of Excellence in Bioinformatics and the 2009 University of Birmingham Ramsay Research Scholarship Award. Hong Yan was supported by a grant from City University of Hong Kong (Project 7002843).

Author information

Correspondence to Jakub Mažgut.

Appendix: Parameter estimation

To get analytical parameter updates, we use the trick of [11] and take advantage of the fact that while the model log-likelihood (7) is not jointly concave in all the parameters, it is concave in each group of parameters when the others are kept fixed. This leads to the iterative estimation scheme detailed below.

The analytical updates will be derived from a lower bound on the log-likelihood (7) using [11]:

$$\log \sigma(\hat \theta) \ge -\log 2 + \frac{\hat \theta}{2} - \log \cosh \left( \frac{\theta}{2}\right) - (\hat \theta^2 - \theta^2) \,\frac{\tanh \frac{\theta}{2}}{4 \theta},$$
(31)

where \(\theta\) stands for the current value of individual natural parameters \(\theta_{m,{\bf i}}\) of the Bernoulli noise models \({P({\mathcal A}_{m,{\bf i}} | \theta_{m,{\bf i}})}\) and \(\hat \theta\) stands for the future estimate of the parameter, given the current parameter values. Hence, from (7) we obtain (see Note 4):

$${\mathcal {L}}(\hat \Uptheta) = \sum_{m=1}^M \sum_{{\bf i} \in \Upupsilon} {\mathcal A}_{m,{\bf i}} \log \sigma(\hat \theta_{m,{\bf i}}) + (1-{\mathcal A}_{m,{\bf i}}) \log \sigma(- \hat \theta_{m,{\bf i}}) \ge \sum_{m=1}^M \sum_{{\bf i} \in \Upupsilon} {\mathcal A}_{m,{\bf i}} \left[ - \log 2 + \frac{\hat \theta_{m,{\bf i}}}{2} \right. \left.- \log \cosh \left( \frac{\theta_{m,{\bf i}}}{2}\right) - (\hat \theta_{m,{\bf i}}^2 - \theta_{m,{\bf i}}^2) \frac{\tanh \frac{\theta_{m,{\bf i}}}{2}}{4 \theta_{m,{\bf i}}} \right] + (1-{\mathcal A}_{m,{\bf i}}) \left[ -\log 2 - \frac{\hat \theta_{m,{\bf i}}}{2} \right. \left.- \log \cosh \left( \frac{\theta_{m,{\bf i}}}{2}\right) - (\hat \theta_{m,{\bf i}}^2 - \theta_{m,{\bf i}}^2) \frac{\tanh \frac{\theta_{m,{\bf i}}}{2}}{4 \theta_{m,{\bf i}}} \right]$$
(32)
$$= \,H(\hat \Uptheta, \Uptheta).$$
(33)

Denote \((\tanh \frac{\theta_{m,{\bf i}}}{2}) / \theta_{m,{\bf i}}\) by \(\Uppsi_{m,{\bf i}}. \) Grouping together constant terms in (32) leads to

$$H(\hat \Uptheta, \Uptheta) = \sum_{m=1}^M \sum_{{\bf i} \in \Upupsilon} \left[ \hat \theta_{m,{\bf i}} \left({\mathcal A}_{m,{\bf i}} - \frac{1}{2}\right) - \frac{\Uppsi_{m,{\bf i}}}{4} \,\hat \theta_{m,{\bf i}}^2 \right] + \,Const. $$
(34)

Note that \({H(\hat \Uptheta, \Uptheta) = {\mathcal {L}}(\hat \Uptheta)}\) only if \(\hat \Uptheta = \Uptheta.\) Therefore, by choosing \(\hat \Uptheta\) that maximizes \(H(\hat \Uptheta, \Uptheta)\) we guarantee \({{\mathcal {L}}(\hat \Uptheta) \ge H(\hat \Uptheta, \Uptheta) \ge H(\Uptheta, \Uptheta) = {\mathcal {L}}(\Uptheta)}\) [11].
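
For concreteness, the following minimal numerical sketch (Python/NumPy; not part of the original paper, all names are illustrative) checks the bound (31) and shows a numerically safe evaluation of the coefficient \(\tanh(\theta/2)/\theta\), whose limit at \(\theta \to 0\) is 1/2:

    import numpy as np

    def psi(theta):
        # tanh(theta/2)/theta, with the theta -> 0 limit (= 1/2) handled explicitly
        theta = np.asarray(theta, dtype=float)
        out = np.full_like(theta, 0.5)
        nz = np.abs(theta) > 1e-8
        out[nz] = np.tanh(theta[nz] / 2.0) / theta[nz]
        return out

    def log_sigma(x):
        # numerically stable log of the logistic sigmoid
        return -np.logaddexp(0.0, -x)

    def lower_bound(theta_hat, theta):
        # right-hand side of (31), as a function of the new and current parameters
        return (-np.log(2.0) + theta_hat / 2.0
                - np.log(np.cosh(theta / 2.0))
                - (theta_hat ** 2 - theta ** 2) * psi(theta) / 4.0)

    rng = np.random.default_rng(0)
    theta = rng.normal(size=1000)       # current natural parameters
    theta_hat = rng.normal(size=1000)   # candidate new parameters
    assert np.all(log_sigma(theta_hat) >= lower_bound(theta_hat, theta) - 1e-12)
    assert np.allclose(log_sigma(theta), lower_bound(theta, theta))  # equality at theta_hat = theta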

We are now ready to constrain the Bernoulli parameters to be optimized [see (9)]:

$$\hat \theta_{m,{\bf i}} = \sum_{{\bf r} \in \rho} {\mathcal {Q}}_{m,{\bf r}} \cdot \prod_{n=1}^N u^{(n)}_{r_n,i_n} + \Updelta_{{\bf i}}.$$
(35)

We will update the model parameters so as to maximize

$${\mathcal{H}}= \sum_{m=1}^M \sum_{{\bf i} \in \Upupsilon} {\mathcal{H}}_{m,{\bf i}},$$
(36)

where

$${\mathcal{H}}_{m,{\bf i}} = \left({\mathcal A}_{m,{\bf i}} - \frac{1}{2}\right) \hat \theta_{m,{\bf i}} - \frac{\Uppsi_{m,{\bf i}}}{4} \,\hat \theta_{m,{\bf i}}^2,$$
(37)

with \(\hat \theta_{m,{\bf i}}\) given by (35).
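
As an illustration, the sketch below (again Python/NumPy with hypothetical names; one possible way of implementing the multilinear contraction in (35), not the authors' code) assembles the constrained natural parameters from the expansion coefficients, the n-mode bases and the bias tensor, and evaluates the surrogate objective (36)-(37) on a toy second-order example:

    import numpy as np

    def natural_params(Q, U, Delta):
        # theta_hat[m, i] = sum_r Q[m, r] * prod_n U[n][r_n, i_n] + Delta[i]   (eq. 35)
        # Q: (M, R_1, ..., R_N), U[n]: (R_n, I_n), Delta: (I_1, ..., I_N)
        Theta = Q
        for n, Un in enumerate(U):
            # contract the R_n axis of Theta with the R_n axis of U[n];
            # moveaxis puts the resulting I_n axis back into position n + 1
            Theta = np.moveaxis(np.tensordot(Theta, Un, axes=(n + 1, 0)), -1, n + 1)
        return Theta + Delta[None, ...]

    def surrogate_H(A, Theta_hat, Psi):
        # H of (36)-(37), up to the additive constant dropped in (34)
        return np.sum((A - 0.5) * Theta_hat - 0.25 * Psi * Theta_hat ** 2)

    # toy second-order example: M = 5 binary 8 x 6 matrices, R = [3 x 2]
    rng = np.random.default_rng(1)
    M, I, R = 5, (8, 6), (3, 2)
    A = rng.integers(0, 2, size=(M, *I)).astype(float)
    Q = rng.normal(size=(M, *R))
    U = [rng.normal(size=(R[n], I[n])) for n in range(len(I))]
    Delta = rng.normal(size=I)
    Theta = natural_params(Q, U, Delta)    # current natural parameters
    Psi = np.tanh(Theta / 2.0) / Theta     # see the psi() helper above for theta close to 0
    print(surrogate_H(A, Theta, Psi))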

1.1 Updates for n-mode space basis

When updating the n-mode space basis \(\{ {\bf u}^{(n)}_1, {\bf u}^{(n)}_2, \ldots, {\bf u}^{(n)}_{R_n} \},\) the bias tensor \(\Updelta\) and the expansion coefficients \({\mathcal {Q}}_{m,{\bf r}},\) m = 1, 2, ..., M, \({\bf r} \in \rho,\) are kept fixed to their current values.

For n = 1, 2, ..., N, define

$$\Upupsilon_{-n} = \{1,2,\ldots,I_1\} \times\cdots \times \{1,2,\ldots,I_{n-1}\} \times \{ 1 \} \times \{1,2,\ldots,I_{n+1}\} \times\cdots\times \{1,2,\ldots,I_N\},$$
(38)

with obvious interpretation in the boundary cases. Given \({\bf i} \in \Upupsilon_{-n}\) and an n-mode index \(j \in \{1,2,\ldots,I_{n}\},\) the index N-tuple \((i_1, \ldots, i_{n-1}, j, i_{n+1}, \ldots, i_N)\) formed by inserting j at the n-th place of \({\bf i}\) is denoted by \([{\bf i}, j|n].\)

In order to evaluate

$$\frac{\partial \,{\mathcal{H}}}{\partial \,u^{(n)}_{q,j}}, \quad q=1,2,\ldots,R_n, \,j = 1,2,\ldots,I_n,$$

we realize that \(u^{(n)}_{q,j}\) is involved in expressing all \(\hat \theta_{m,[{\bf i}, j|n]},\) m = 1, 2, ..., M, with \({\bf i} \in \Upupsilon_{-n}.\) Therefore,

$$\frac{\partial \,{\mathcal{H}}}{\partial \,u^{(n)}_{q,j}} = \sum_{m=1}^M \,\sum_{{\bf i} \in \Upupsilon_{-n}} \frac{\partial \,{\mathcal{H}}_{m,[{\bf i}, j|n] }} {\partial \,\hat \theta_{m,[{\bf i}, j|n]}} \,\frac{\partial \,\hat \theta_{m,[{\bf i}, j|n]}}{\partial \,u^{(n)}_{q,j}},$$
(39)

where

$$\frac{\partial \,{\mathcal{H}}_{m,[{\bf i}, j|n] }} {\partial \,\hat \theta_{m,[{\bf i}, j|n]}} = \left({\mathcal A}_{m,[{\bf i}, j|n]} - \frac{1}{2}\right) - \frac{\Uppsi_{m, [{\bf i}, j|n] }}{2} \,\hat \theta_{m,[{\bf i}, j|n]}$$
(40)

and from (35),

$$\frac{\partial \,\hat \theta_{m,[{\bf i}, j|n]}}{\partial \,u^{(n)}_{q,j}} = {\mathcal {B}}^{(n)}_{m,{\bf i},q} = \sum_{{\bf r} \in \rho_{-n}} {\mathcal {Q}}_{m, [{\bf r}, q|n]} \cdot \prod_{s=1, s \neq n}^N u^{(s)}_{r_s,i_s}.$$
(41)

Here, the index set \(\rho_{-n}\) is defined analogously to \(\Upupsilon_{-n}:\)

$$\rho_{-n} = \{1,2,\ldots,R_1\} \times\cdots\times \{1,2,\ldots,R_{n-1}\} \times \{ 1 \} \times \{1,2,\ldots,R_{n+1}\} \times\cdots\times \{1,2,\ldots,R_N\}.$$
(42)

Setting the derivative (39) to zero results in

$${\sum_{m=1}^M \,\sum_{{\bf i} \in \Upupsilon_{-n}} (2 {\mathcal A}_{m,[{\bf i}, j|n]} - 1) \,{\mathcal {B}}^{(n)}_{m,{\bf i},q} = } \sum_{m=1}^M \ \sum_{{\bf i} \in \Upupsilon_{-n}} \Uppsi_{m, [{\bf i}, j|n]} \ \hat \theta_{m,[{\bf i}, j|n]} \ {\mathcal {B}}^{(n)}_{m,{\bf i},q}.$$
(43)

Rewriting (35) as

$$\hat \theta_{m,[{\bf i},j|n]} = \sum_{t = 1}^{R_n} \ \sum_{{\bf r} \in \rho_{-n}} {\mathcal {Q}}_{m, [{\bf r},t|n] } \ u^{(n)}_{t,j} \prod_{s=1, s \neq n}^N u^{(s)}_{r_s,i_s} + \Updelta_{[{\bf i},j|n]}$$
(44)

and applying to (43) we obtain

$$\sum_{t = 1}^{R_n} u^{(n)}_{t,j} \ {\mathcal {K}}^{(n)}_{q,t,j} = {\mathcal {S}}^{(n)}_{q,j},$$
(45)

where

$${\mathcal {S}}^{(n)}_{q,j} = \sum_{m=1}^M \sum_{{\bf i} \in \Upupsilon_{-n}} (2 {\mathcal A}_{m,[{\bf i}, j|n]} - 1 - \Uppsi_{m, [{\bf i}, j|n]} \Updelta_{[{\bf i}, j|n]}) {\mathcal {B}}^{(n)}_{m,{\bf i},q},$$
(46)

and

$${\mathcal {K}}^{(n)}_{q,t,j} = \sum_{m=1}^M \sum_{{\bf r} \in \rho_{-n}} {\mathcal {Q}}_{m, [{\bf r},t|n] } \times \sum_{{\bf i} \in \Upupsilon_{-n}} \Uppsi_{m, [{\bf i}, j|n]} \ {\mathcal {B}}^{(n)}_{m,{\bf i},q} \prod_{s=1, s \neq n}^N u^{(s)}_{r_s,i_s}.$$
(47)

For each n-mode coordinate \(j \in \{ 1,2,\ldots,I_n\},\) collect the j-th coordinate values of all n-mode basis vectors into a column vector \({\bf u}^{(n)}_{:,j} = (u^{(n)}_{1,j}, u^{(n)}_{2,j}, \ldots, u^{(n)}_{R_n,j} )^T.\) Analogously, stack all the \({\mathcal {S}}^{(n)}_{q,j}\) values in a column vector \({\mathcal {S}}^{(n)}_{:,j} = ({\mathcal {S}}^{(n)}_{1,j}, {\mathcal {S}}^{(n)}_{2,j}, \ldots, {\mathcal {S}}^{(n)}_{R_n,j})^T.\) Finally, we construct an \(R_n \times R_n\) matrix \({\mathcal {K}}^{(n)}_{:,:,j}\) whose q-th row is \(({\mathcal {K}}^{(n)}_{q,1,j}, {\mathcal {K}}^{(n)}_{q,2,j},\ldots, {\mathcal {K}}^{(n)}_{q,R_n,j}), q = 1,2,\ldots,R_n.\) The n-mode basis vectors are updated by solving \(I_n\) linear systems of size \(R_n \times R_n\):

$${\mathcal {K}}^{(n)}_{:,\,:,\,j} \, {\bf u}^{(n)}_{:,\,j} = {\mathcal {S}}^{(n)}_{:,\,j},\quad j = 1,2,\ldots,I_n.$$
(48)
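
In code, the update (48) amounts to one small linear solve per n-mode coordinate. A minimal sketch (illustrative names, not the authors' implementation), assuming the arrays \({\mathcal K}^{(n)}\) and \({\mathcal S}^{(n)}\) have already been assembled according to (46) and (47):

    import numpy as np

    def update_mode_basis(K_n, S_n):
        # K_n[q, t, j] = K^{(n)}_{q,t,j}  (shape R_n x R_n x I_n), from (47)
        # S_n[q, j]    = S^{(n)}_{q,j}    (shape R_n x I_n),       from (46)
        # returns the updated basis matrix U_n with U_n[t, j] = u^{(n)}_{t,j}
        R_n, _, I_n = K_n.shape
        U_n = np.empty((R_n, I_n))
        for j in range(I_n):
            # one R_n x R_n system per n-mode coordinate j, as in (48)
            U_n[:, j] = np.linalg.solve(K_n[:, :, j], S_n[:, j])
        return U_n

Since \(R_n\) is typically small, the I_n systems are cheap; they could equally be solved in one batched call via np.linalg.solve on an array of shape (I_n, R_n, R_n).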

1.2 Updates for expansion coefficients

When updating the expansion coefficients \({\mathcal {Q}}_{m,{\bf r}},\) the bias tensor \(\Updelta\) and the basis sets \(\{ {\bf u}^{(n)}_1, {\bf u}^{(n)}_2,\ldots, {\bf u}^{(n)}_{R_n} \}\) for all modes n = 1, 2, ..., N are kept fixed to their current values.

For \({\bf r} \in \rho\) and \({\bf i} \in \Upupsilon\) denote \(\prod_{n=1}^N u^{(n)}_{r_n,i_n}\) by \(C_{{\bf r},{\bf i}}.\) For data index \(\ell = 1,2,\ldots,M\) and basis index \({\bf v} \in \rho\) we have

$$\frac{\partial \ {\mathcal{H}}}{\partial \ {\mathcal {Q}}_{\ell,{\bf v}}} = \sum_{m=1}^M \ \sum_{{\bf i} \in \Upupsilon} \frac{\partial \ {\mathcal{H}}_{m,{\bf i}}} {\partial \ \hat \theta_{m,{\bf i}}} \ \frac{\partial \ \hat \theta_{m,{\bf i}}}{\partial \ {\mathcal {Q}}_{\ell,{\bf v}}},$$
(49)

where

$$\frac{\partial \ {\mathcal{H}}_{m,{\bf i}}} {\partial \ \hat \theta_{m,{\bf i}}} = \left({\mathcal A}_{m,{\bf i}} - \frac{1}{2}\right) - \frac{\Uppsi_{m, {\bf i} }}{2} \ \hat \theta_{m,{\bf i}}$$
(50)

and \({ \frac{\partial \ \hat \theta_{m,{\bf i}}}{\partial \ {\mathcal {Q}}_{\ell,{\bf v}}} = C_{{\bf v},{\bf i}}}\) if \(m=\ell\) and \({\frac{\partial \ \hat \theta_{m,{\bf i}}}{\partial \ {\mathcal {Q}}_{\ell,{\bf v}}} = 0}\) otherwise.

By imposing \({\frac{\partial \ \mathcal{H}}{\partial \ {\mathcal {Q}}_{\ell,{\bf v}}} = 0, }\) we get

$${\mathcal {T}}_{{\bf v},\ell} = \sum_{{\bf r} \in \rho} {\mathcal {P}}_{{\bf v},{\bf r},\ell} \ {\mathcal {Q}}_{\ell,{\bf r}},$$
(51)

where

$${\mathcal {T}}_{{\bf v},\ell} = \sum_{{\bf i} \in \Upupsilon} (2 {\mathcal A}_{\ell,{\bf i}} - 1 - \Uppsi_{\ell,{\bf i}} \ \Updelta_{{\bf i}}) \ C_{{\bf v},{\bf i}}$$
(52)

and

$${\mathcal {P}}_{{\bf v},{\bf r},\ell} = \sum_{{\bf i} \in \Upupsilon} \Uppsi_{\ell,{\bf i}} \ C_{{\bf v},{\bf i}} \ C_{{\bf r},{\bf i}}.$$
(53)

To solve for expansion coefficients using the tools of matrix algebra, we need to vectorize tensor indices. Consider any one-to-one function \(\kappa\) from \(\rho\) to \(\{ 1,2,\ldots,\prod_{n=1}^N R_n \}.\) For each input tensor index \(\ell = 1,2,\ldots,M,\)

  • create a square \((\prod_{n=1}^N R_n) \times (\prod_{n=1}^N R_n)\) matrix \({\mathcal {P}}_{:,:,\ell}\) whose \((\kappa({\bf v}), \kappa({\bf r}))\)-th element is equal to \({\mathcal {P}}_{{\bf v},{\bf r},\ell},\)

  • stack the values of \({\mathcal {T}}_{{\bf v},\ell}\) into a column vector \({\mathcal {T}}_{:,\ell}\) whose \(\kappa({\bf v})\)-th coordinate is \({\mathcal {T}}_{{\bf v},\ell},\)

  • collect the expansion coefficients \({\mathcal {Q}}_{\ell,{\bf r}}\) in a column vector \({\mathcal {Q}}_{\ell,:}\) with \(\kappa({\bf r})\)-th coordinate equal to \({\mathcal {Q}}_{\ell,{\bf r}}.\)

The expansion coefficients for the \(\ell\)-th input tensor \({{\mathcal A}_\ell}\) can be obtained by solving

$${\mathcal {P}}_{:,:,\ell} \ {\mathcal {Q}}_{\ell,:} = {\mathcal {T}}_{:,\ell}, \ \ \ \ell = 1,2,\ldots,M.$$
(54)
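
A minimal sketch of (54) (hypothetical names, not the authors' code), assuming the matrices \({\mathcal P}_{:,:,\ell}\) and vectors \({\mathcal T}_{:,\ell}\) have been assembled from (52) and (53), and taking \(\kappa\) to be the usual row-major (C-order) flattening of \(\rho\):

    import numpy as np

    def update_expansion_coefficients(P, T, core_shape):
        # P[l]: the (prod_n R_n) x (prod_n R_n) matrix P_{:,:,l} of (53)
        # T[l]: the corresponding right-hand side vector T_{:,l} of (52)
        # core_shape = (R_1, ..., R_N); kappa is taken to be C-order flattening
        M = P.shape[0]
        Q_vec = np.linalg.solve(P, T[..., None])[..., 0]   # batched solve, one system per input tensor
        return Q_vec.reshape((M, *core_shape))             # undo the kappa vectorisation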

1.3 Updates for the bias tensor

As before, when updating the bias tensor \(\Updelta,\) the expansion coefficients \({\mathcal {Q}}_{m,{\bf r}},\) m = 1, 2, ..., M, \({\bf r} \in \rho,\) and the basis sets \(\{ {\bf u}^{(n)}_1, {\bf u}^{(n)}_2, \ldots, {\bf u}^{(n)}_{R_n} \}\) for all modes n = 1, 2, ..., N are kept fixed to their current values.

Fix \({\bf j} \in \Upupsilon.\) We evaluate

$$\frac{\partial \ {\mathcal{H}}}{\partial \ \Updelta_{{\bf j}}} = \sum_{m=1}^M \ \sum_{{\bf i} \in \Upupsilon} \frac{\partial \ {\mathcal{H}}_{m,{\bf i}}} {\partial \ \hat \theta_{m,{\bf i}}} \ \frac{\partial \ \hat \theta_{m,{\bf i}}}{\partial \ \Updelta_{{\bf j}}},$$
(55)

where \(\frac{\partial \ \hat \theta_{m,{\bf i}}}{\partial \ \Updelta_{{\bf j}}}\) is equal to 1 if i = j and 0 otherwise.

Solving for \(\frac{\partial \ \mathcal{H}}{\partial \ \Updelta_{{\bf j}}} = 0\) leads to

$$\Updelta_{{\bf j}} = \frac{\sum_{m=1}^M \left( 2 {\mathcal A}_{m,{\bf j}} - 1 - \Uppsi_{m,{\bf j}} \sum_{{\bf r} \in \rho} {\mathcal {Q}}_{m,{\bf r}} \ C_{{\bf r},{\bf j}} \right)} {\sum_{m=1}^M \Uppsi_{m,{\bf j}}}.$$
(56)
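
In code, (56) is a single element-wise update. A minimal sketch (illustrative names), assuming the bias-free part of (35), i.e. \(\sum_{{\bf r} \in \rho} {\mathcal Q}_{m,{\bf r}} C_{{\bf r},{\bf i}},\) and the current \(\Uppsi\) values are available as arrays:

    import numpy as np

    def update_bias(A, Theta_no_bias, Psi):
        # A:             binary data tensor, shape (M, I_1, ..., I_N)
        # Theta_no_bias: sum_r Q[m, r] * prod_n U[n][r_n, i_n], i.e. (35) without Delta
        # Psi:           tanh(theta/2)/theta at the current parameter values
        numer = np.sum(2.0 * A - 1.0 - Psi * Theta_no_bias, axis=0)   # sum over samples m
        denom = np.sum(Psi, axis=0)
        return numer / denom                                          # eq. (56), element-wise over i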

Cite this article

Mažgut, J., Tiňo, P., Bodén, M. et al. Dimensionality reduction and topographic mapping of binary tensors. Pattern Anal Applic 17, 497–515 (2014). https://doi.org/10.1007/s10044-013-0317-y
