Abstract
We introduce a method to extract compressed outline shapes of objects from the global textures of volumetric data and to classify them by multiway tensor analysis. For the extraction of outline shapes, we apply three-way tensor principal component analysis to voxel images. A small number of major principal components represent the shape of objects in a voxel image. For the classification of objects, we use the tensor subspace method. Using the extracted outline shapes and the tensor-based classification method, we achieve pattern recognition for volumetric data.
1 Introduction
For shape analysis in medicine and biology, outline shapes are a fundamental feature for recognition and retrieval in information filtering from large amounts of data. Therefore, for fast recognition and retrieval, we need to extract outline shapes from volumetric medical data. Since the nature of medical and biological data is volumetric, these data are expressed as multiway data arrays. Using multiway data analysis [1], we extract compressed outline shapes of objects from the global texture of volumetric data. For this extraction, we apply three-way tensor principal component analysis [2] to voxel images, whereas in traditional data analysis, three-way data are embedded in a vector space. A small number of major principal components represent the shape of the objects in a voxel image. Applying the tensor subspace method (TSM) [3] to these major principal components, we construct a tensor-based classification. In the classification, the TSM measures the similarity between a query and a tensor subspace spanned by the principal axes of a category.
For numerical computation, we deal with sampled patterns. In traditional pattern recognition, these sampled patterns are embedded in an appropriate-dimensional Euclidean space as vectors. An alternative is to deal with sampled patterns as higher-dimensional array data. These array data are expressed as tensors to preserve the multilinearity of functions in the original pattern space. Tensors allow us to express multidimensional array data in multilinear forms.
For the analysis of three-way arrays, three-mode factor analysis has been proposed in statistics [1]. This method is called Tucker decomposition. The Tucker decomposition with orthogonal constraints is equivalent to third-order tensor principal component analysis (TPCA) [2]. Applying this three-way TPCA to voxel images, we can directly extract the compressed outline shape of objects from volumetric data. In the decomposition procedure of the TPCA, we can extract the multilinear structure of each category from data. Using these per-category structures, we can construct a subspace learning method.
For the extraction of outline shapes in two-dimensional images, the active contour model [4] and the active shape model [5] have been proposed. The active contour model, known as snakes, extracts the outline of a shape by an energy-minimising spline without specific models for categories. The active shape model extracts the shape of objects using specific models, which are obtained from the learning data of categories. As extensions of these shape extraction models to point clouds in three-dimensional space, statistical models have been proposed [6, 7]. In two- and three-dimensional images, these models rely on sets of points that represent the boundaries of objects in images. These points are extracted manually or semi-automatically in advance as preprocessing, whereas our method extracts an outline shape without the extraction of points.
Furthermore, we show that the compression of volumetric data by the three-way TPCA can be approximated by the reduction based on the three-dimensional discrete cosine transform (3D-DCT).
2 Extraction Based on Tensor Form
2.1 Tensor Representation for N-way Arrays
An Nth-order tensor \(\mathcal {X}\) defined in \(\mathbb {R}^{I_1 \times I_2 \times \dots \times I_N}\) is expressed as
$$\mathcal {X} = ({x}_{i_1,i_2, \dots ,i_N})$$
for \({x}_{i_1,i_2, \dots ,i_N}\in \mathbb {R}\) with N indices \(i_n\). Each subscript n denotes the n-mode of \(\mathcal {X}\). For the outer product of N vectors \(\varvec{u}^{(n)} \in \mathbb {R}^{I_n}\), if the tensor \(\mathcal {X}\) satisfies the condition
$$\mathcal {X} = \varvec{u}^{(1)} \circ \varvec{u}^{(2)} \circ \dots \circ \varvec{u}^{(N)},$$
where \(\circ \) denotes the outer product, we call this tensor \(\mathcal {X}\) a rank-one tensor. For \(\mathcal {X}\), the n-mode vectors, \(n=1,2,\dots ,N\), are defined as the \(I_n\)-dimensional vectors obtained from \(\mathcal {X}\) by varying the index \(i_n\) while fixing all the other indices. The unfolding of \(\mathcal {X}\) along the n-mode vectors of \(\mathcal {X}\) is defined by
$$\mathcal {X}_{(n)} \in \mathbb {R}^{I_n \times I_{n'}},$$
where \(I_{n'} = I_1 \times I_2 \times \dots \times I_{n-1} \times I_{n+1} \times \dots \times I_N\), and the column vectors of \(\mathcal {X}_{(n)}\) are the n-mode vectors of \(\mathcal {X}\). The n-mode product \(\mathcal {X}\times _n \varvec{U}\) of a matrix \(\varvec{U} \in \mathbb {R}^{J_n\times I_n}\) and a tensor \(\mathcal {X}\) is a tensor \(\mathcal {G} \in \mathbb {R}^{I_1 \times I_2 \times \dots \times I_{n-1} \times J_n \times I_{n+1}\times \dots \times I_N}\) with elements
$$g_{i_1, \dots , i_{n-1}, j_n, i_{n+1}, \dots , i_N} = \sum _{i_n=1}^{I_n} x_{i_1, i_2, \dots , i_N}\, u_{j_n, i_n},$$
following the manner in ref. [8]. For the m- and n-mode products, \(m \ne n\), by matrices \(\varvec{U}\) and \(\varvec{V}\), we have
$$\mathcal {X} \times _m \varvec{U} \times _n \varvec{V} = \mathcal {X} \times _n \varvec{V} \times _m \varvec{U},$$
since n-mode projections are commutative [8]. We define the inner product of two tensors \(\mathcal {X}=(x_{i_1,i_2,\dots ,i_N})\) and \(\mathcal {Y}=(y_{i_1,i_2,\dots ,i_N})\) in \(\mathbb {R}^{I_1 \times I_2 \times \dots \times I_N}\) by
$$\langle \mathcal {X}, \mathcal {Y} \rangle = \sum _{i_1=1}^{I_1} \sum _{i_2=1}^{I_2} \dots \sum _{i_N=1}^{I_N} x_{i_1,i_2,\dots ,i_N}\, y_{i_1,i_2,\dots ,i_N}.$$
Using this inner product, the Frobenius norm of a tensor \(\mathcal {X}\) is
$$\Vert \mathcal {X} \Vert _F = \sqrt{\langle \mathcal {X}, \mathcal {X} \rangle } = \Vert \mathrm {vec}\, \mathcal {X} \Vert _2,$$
where vec and \(\Vert \cdot \Vert _2\) are the vectorisation operator and the Euclidean norm of a vector, respectively. For two tensors \(\mathcal {X}_1\) and \(\mathcal {X}_2\), we define the distance between them as
$$d(\mathcal {X}_1, \mathcal {X}_2) = \Vert \mathcal {X}_1 - \mathcal {X}_2 \Vert _F.$$
Although this definition is a tensor-based measure, this distance is equivalent to the Euclidean distance between the vectorised tensors \(\mathcal {X}_1\) and \(\mathcal {X}_2\) by the norm identity above.
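These definitions translate directly into array code. Below is a minimal NumPy sketch (our illustration; the paper itself provides no code) of the n-mode unfolding, the n-mode product, the inner product, the Frobenius norm and the induced distance; modes are 0-indexed, following NumPy conventions.

```python
# A minimal NumPy sketch of the tensor operations defined above.
# This is our illustration, not code from the paper.
import numpy as np

def unfold(X, n):
    """n-mode unfolding X_(n): rows are indexed by i_n, columns by the rest."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_product(X, U, n):
    """n-mode product X x_n U for U in R^{J_n x I_n}: every n-mode vector
    of X is multiplied by U."""
    return np.moveaxis(np.tensordot(U, X, axes=(1, n)), 0, n)

def inner(X, Y):
    """Inner product <X, Y>: elementwise product summed over all indices."""
    return float(np.sum(X * Y))

def frobenius(X):
    """Frobenius norm ||X||_F = ||vec X||_2."""
    return np.sqrt(inner(X, X))

def distance(X1, X2):
    """d(X1, X2) = ||X1 - X2||_F."""
    return frobenius(X1 - X2)

# Commutativity of the m- and n-mode products for m != n:
X = np.random.rand(4, 5, 6)
U, V = np.random.rand(3, 4), np.random.rand(2, 6)
A = mode_product(mode_product(X, U, 0), V, 2)
B = mode_product(mode_product(X, V, 2), U, 0)
assert np.allclose(A, B)
```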
2.2 Projection to Tensor Subspace
As the tensor \(\mathcal {X}\) is in the tensor space \(\mathbb {R}^{I_1} \otimes \mathbb {R}^{I_2} \otimes \dots \otimes \mathbb {R}^{I_N}\), the tensor space can be interpreted as the Kronecker product of N vector spaces \(\mathbb {R}^{I_1},\mathbb {R}^{I_2}, \dots , \mathbb {R}^{I_N}\). To project \(\mathcal {X} \in \mathbb {R}^{I_1} \otimes \mathbb {R}^{I_2} \otimes \dots \otimes \mathbb {R}^{I_N}\) to another tensor \(\mathcal {Y}\) in a lower-dimensional tensor space \(\mathbb {R}^{P_1} \otimes \mathbb {R}^{P_2} \otimes \dots \otimes \mathbb {R}^{P_N}\), where \(P_n \le I_n \) for \(n=1,2, \dots , N\), we need N orthogonal matrices \(\{ \varvec{U}^{(n)} \in \mathbb {R}^{I_n \times P_n} \}_{n=1}^N\). Using the N projection matrices, the tensor-to-tensor projection (TTP) is given by
$$\mathcal {Y} = \mathcal {X} \times _1 \varvec{U}^{(1)\top } \times _2 \varvec{U}^{(2)\top } \times _3 \dots \times _N \varvec{U}^{(N)\top }.$$
This projection is established in N steps, where at the nth step, each n-mode vector is projected to a \(P_n\)-dimensional space by \(\varvec{U}^{(n)}\). The reconstruction \(\hat{\mathcal {X}}\) from a projected tensor \(\mathcal {Y}\) is achieved by
$$\hat{\mathcal {X}} = \mathcal {Y} \times _1 \varvec{U}^{(1)} \times _2 \varvec{U}^{(2)} \times _3 \dots \times _N \varvec{U}^{(N)}.$$
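As a sketch under the same conventions (0-indexed modes; mode_product as in the snippet above), the TTP and its reconstruction might be written as follows; the names ttp and reconstruct are ours.

```python
# Sketch of the tensor-to-tensor projection (TTP) and its reconstruction.
# Assumes orthonormal factor matrices U^(n) in R^{I_n x P_n}.
import numpy as np

def mode_product(X, U, n):
    return np.moveaxis(np.tensordot(U, X, axes=(1, n)), 0, n)

def ttp(X, Us):
    """Y = X x_1 U1^T x_2 U2^T ... x_N UN^T: each n-mode vector is
    projected to a P_n-dimensional space."""
    Y = X
    for n, U in enumerate(Us):
        Y = mode_product(Y, U.T, n)
    return Y

def reconstruct(Y, Us):
    """X_hat = Y x_1 U1 x_2 U2 ... x_N UN."""
    X = Y
    for n, U in enumerate(Us):
        X = mode_product(X, U, n)
    return X

# Example with random orthonormal factors (thin QR):
X = np.random.rand(8, 9, 10)
Us = [np.linalg.qr(np.random.rand(I, P))[0] for I, P in [(8, 4), (9, 4), (10, 4)]]
assert ttp(X, Us).shape == (4, 4, 4)
assert reconstruct(ttp(X, Us), Us).shape == X.shape
```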
2.3 Principal Component Analysis for Third-Order Tensors
A third-order tensor \(\mathcal {X}\in \mathbb {R}^{I_1 \times I_2 \times I_3}\) is a three-way array whose elements are indexed by a triplet of indices \((i_1, i_2, i_3)\). We set the identity matrices \(\varvec{I}_j, \ j=1, 2, 3\), in \(\mathbb {R}^{I_j\times I_j}\). Here we summarise the higher-order singular value decomposition (HOSVD) [9] for third-order tensors. For a collection of tensors \(\{\mathcal {X}_i\}_{i=1}^N \subset \mathbb {R}^{I_1 \times I_2 \times I_3}\) satisfying the zero expectation condition \(\mathrm {E} (\mathcal {X}_i ) = 0\), we compute
$$\hat{\mathcal {X}}_i = \mathcal {X}_i \times _1 \varvec{U}^{(1)\top } \times _2 \varvec{U}^{(2)\top } \times _3 \varvec{U}^{(3)\top },$$
where \(\varvec{U}^{(j)}=[\varvec{u}^{(j)}_1,\dots ,\varvec{u}^{(j)}_{I_j}]\) are matrices that minimise the criterion
$$J_- = \mathrm {E}\left( \Vert \mathcal {X}_i - \hat{\mathcal {X}}_i \times _1 \varvec{U}^{(1)} \times _2 \varvec{U}^{(2)} \times _3 \varvec{U}^{(3)} \Vert _F^2 \right) $$
and maximise the criterion
$$J_+ = \mathrm {E}\left( \Vert \hat{\mathcal {X}}_i \Vert _F^2 \right) $$
with respect to the conditions \(\varvec{U}^{(j)\top }\varvec{U}^{(j)} =\varvec{I}_j\). By fixing \(\{ \varvec{U}^{(j)} \}_{j=1}^3\) except \(\varvec{U}^{(j')}\), \(j' \in \{ 1, 2, 3 \}\), we have
$$J_{j'} = \mathrm {E}\left( \Vert \varvec{U}^{(j')\top } \mathcal {X}_{i,(j')} \Vert _F^2 \right) .$$
Eigendecomposition problems are derived by computing the extremes of
$$E_{j} = J_{j} + \mathrm {tr}\left( \varvec{\varLambda }^{(j)} \left( \varvec{I}_j - \varvec{U}^{(j)\top } \varvec{U}^{(j)} \right) \right) ,$$
where \(\varvec{\varLambda }^{(j)}\) is a matrix of Lagrange multipliers.
For the matrices \(\varvec{M}^{(j)} = \frac{1}{N} \sum _{i=1}^N \mathcal {X}_{i,(j)}\mathcal {X}_{i,(j)}^{\top }\), \(j=1,2,3\), of \(\mathrm {rank} \, \varvec{M}^{(j)}=K\), the optimisation of \(J_-\) derives the eigenvalue decomposition
$$\varvec{M}^{(j)} \varvec{U}^{(j)} = \varvec{U}^{(j)} \varvec{\varSigma }^{(j)},$$
where \(\varvec{\varSigma }^{(j)} = \mathrm {diag}(\lambda ^{(j)}_1, \dots , \lambda ^{(j)}_K, 0, \dots , 0) \in \mathbb {R}^{I_j \times I_j}, \ j=1, 2, 3\), are diagonal matrices satisfying the relationships \(\lambda ^{(j)}_k =\lambda ^{(j')}_k, \ k \in \{ 1, 2, \dots , K\}\), for \(j, j' \in \{1, 2, 3\}\).
For the optimisation of \( \{ J_j \}_{j=1}^3\), there is no closed-form solution to this maximisation problem [9]; we therefore adopt the iterative procedure of multilinear TPCA [2] summarised in Algorithm 1. For \(\varvec{p}_{k} \in \{ \varvec{e}_k\}_{k=1}^K\), we set orthogonal projection matrices \(\varvec{P}^{(j)} = \sum _{k=1}^{k_j} \varvec{p}_{k} \varvec{p}_{k}^{\top }\) for \(j=1, 2, 3\). Using these \(\{ \varvec{P}^{(j)} \}_{j=1}^3\), the low-rank tensor approximation [9] is given by
$$\hat{\mathcal {X}} = \mathcal {X} \times _1 \varvec{U}^{(1)}\varvec{P}^{(1)}\varvec{U}^{(1)\top } \times _2 \varvec{U}^{(2)}\varvec{P}^{(2)}\varvec{U}^{(2)\top } \times _3 \varvec{U}^{(3)}\varvec{P}^{(3)}\varvec{U}^{(3)\top },$$
where \(\varvec{P}^{(j)}\) selects \(k_j\) bases of the projection matrix \(\varvec{U}^{(j)}\). This low-rank approximation is used for compression in the TPCA.
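As a condensed sketch of this iterative optimisation, the following alternates eigendecompositions over the three modes in the spirit of Algorithm 1 (MPCA [2]); the initialisation, the fixed iteration count and all names are our assumptions rather than the authors' exact procedure.

```python
# Sketch of iterative three-way TPCA (in the spirit of Algorithm 1 / MPCA [2]).
# Initialisation and stopping rule are our assumptions.
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_product(X, U, n):
    return np.moveaxis(np.tensordot(U, X, axes=(1, n)), 0, n)

def tpca(tensors, ranks, n_iter=10):
    """tensors: zero-mean arrays of equal shape; ranks: (k1, k2, k3).
    Returns factor matrices U^(j) in R^{I_j x k_j}."""
    shape = tensors[0].shape
    Us = [np.eye(shape[j])[:, :ranks[j]] for j in range(3)]  # truncated identity
    for _ in range(n_iter):
        for j in range(3):
            # j-mode scatter matrix of the tensors with the other modes projected
            M = np.zeros((shape[j], shape[j]))
            for X in tensors:
                Y = X
                for m in range(3):
                    if m != j:
                        Y = mode_product(Y, Us[m].T, m)
                M += unfold(Y, j) @ unfold(Y, j).T
            w, V = np.linalg.eigh(M / len(tensors))   # ascending eigenvalues
            Us[j] = V[:, ::-1][:, :ranks[j]]          # top-k_j eigenvectors
    return Us
```

Compressing a voxel image then amounts to applying the TTP sketched in Sect. 2.2 with the learned factor matrices.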

For the HOSVD of third-order tensors, we have the following theorems.
Theorem 1
The compression computed by HOSVD is equivalent to the compression computed by TPCA.
(Proof) The projection that selects \(K=k_1 k_2 k_3\) bases of the tensor space spanned by \(\varvec{u}^{(1)}_{i_1} \circ \varvec{u}^{(2)}_{i_2} \circ \varvec{u}^{(3)}_{i_3}, \ i_j = 1, 2, \dots , k_j\) for \(j= 1, 2, 3\), is
$$\mathrm {vec}\, \hat{\mathcal {X}} = \varvec{W} \mathrm {vec}\, \mathcal {X}, \quad \varvec{W} = \varvec{P} \left( \varvec{U}^{(3)} \otimes \varvec{U}^{(2)} \otimes \varvec{U}^{(1)} \right) ^{\top },$$
where \(\varvec{P}\) and \(\varvec{U}^{(3)} \otimes \varvec{U}^{(2)} \otimes \varvec{U}^{(1)}\) are the projection matrix that selects the K bases and a unitary matrix, respectively. Therefore, HOSVD is equivalent to TPCA for third-order tensors. (Q.E.D.)
Furthermore, we have the following theorem.
Theorem 2
The HOSVD method is equivalent to the vector PCA method.
(Proof) The equation
$$\hat{\mathcal {X}}_i = \mathcal {X}_i \times _1 \varvec{U}^{(1)}\varvec{U}^{(1)\top } \times _2 \varvec{U}^{(2)}\varvec{U}^{(2)\top } \times _3 \varvec{U}^{(3)}\varvec{U}^{(3)\top }$$
is equivalent to
$$\mathrm {vec}\, \hat{\mathcal {X}}_i = \left( \varvec{U}^{(3)} \otimes \varvec{U}^{(2)} \otimes \varvec{U}^{(1)} \right) \left( \varvec{U}^{(3)} \otimes \varvec{U}^{(2)} \otimes \varvec{U}^{(1)} \right) ^{\top } \mathrm {vec}\, \mathcal {X}_i,$$
which is the orthogonal projection of the vectorised tensor computed by the vector PCA. (Q.E.D.)
This theorem implies that the 3D-DCT-based reduction is an acceptable approximation of HOSVD for third-order tensors, since it is analogous to approximating the PCA of two-dimensional images by a reduction based on the two-dimensional discrete cosine transform [10].
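The equivalence in Theorem 2 rests on the identity \(\mathrm {vec}(\mathcal {X} \times _1 \varvec{A} \times _2 \varvec{B} \times _3 \varvec{C}) = (\varvec{C} \otimes \varvec{B} \otimes \varvec{A})\, \mathrm {vec}\, \mathcal {X}\) for column-major vectorisation. The following snippet (ours, not from the paper) verifies it numerically.

```python
# Numerical check of vec(X x_1 A x_2 B x_3 C) = (C kron B kron A) vec X,
# with column-major (Fortran-order) vectorisation.
import numpy as np

def mode_product(X, U, n):
    return np.moveaxis(np.tensordot(U, X, axes=(1, n)), 0, n)

X = np.random.rand(3, 4, 5)
A, B, C = np.random.rand(3, 3), np.random.rand(4, 4), np.random.rand(5, 5)

lhs = mode_product(mode_product(mode_product(X, A, 0), B, 1), C, 2).flatten('F')
rhs = np.kron(np.kron(C, B), A) @ X.flatten('F')
assert np.allclose(lhs, rhs)
```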
2.4 Reduction by Three-Dimensional Discrete Cosine Transform
For a sampled one-dimensional signal \(\varvec{x}=(x_1,x_2,\dots ,x_n)^{\top }\), we obtain the transformed signal \(\tilde{\varvec{x}} = (\tilde{x}_1,\tilde{x}_2,\dots ,\tilde{x}_n)^{\top } = \varvec{D}\varvec{x}\) by using the discrete cosine transform (DCT)-II [11, 12]. The matrix representation of the DCT is given by
$$\varvec{D} = (d_{ij}), \quad d_{ij} = \alpha _i \sqrt{\frac{2}{n}} \cos \left( \frac{(i-1)(2j-1)\pi }{2n} \right) , \quad \alpha _1 = \frac{1}{\sqrt{2}}, \ \alpha _i = 1 \ (i \ge 2),$$
where \(i, j = 1, 2, \dots , n\).
For a set of third-order tensors \( \{ \varvec{X}_i \}_{i=1}^N\) such that \( \varvec{X}_i \in \mathbb {R}^{n \times n \times n}\), setting a DCT matrix \(\varvec{D} \in \mathbb {R}^{n \times n}\) and projection matrices \(\varvec{P}^{(1)} \in \mathbb {R}^{k_1 \times n}\), \(\varvec{P}^{(2)} \in \mathbb {R}^{k_2 \times n}\) and \(\varvec{P}^{(3)} \in \mathbb {R}^{k_3 \times n}\), we define the 3D-DCT-based reduction by
$$\hat{\varvec{X}}_i = \varvec{X}_i \times _1 \varvec{P}^{(1)}\varvec{D} \times _2 \varvec{P}^{(2)}\varvec{D} \times _3 \varvec{P}^{(3)}\varvec{D},$$
where \(k_1, k_2, k_3 < n\). The 3D-DCT-based reduction is an acceptable approximation of the compression by the PCA, TPCA and HOSVD. If we apply the fast Fourier transform to the computation of the 3D-DCT, the transform of each mode vector costs \(\mathcal {O}(n \log n)\).
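A sketch of this reduction using SciPy's separable DCT routines is given below; the function names dct3_reduce and dct3_reconstruct are hypothetical, and keeping the lowest-frequency \(k_j\) coefficients per mode plays the role of the projection matrices \(\varvec{P}^{(j)}\).

```python
# Sketch of the 3D-DCT-based reduction and its reconstruction.
# dct3_reduce / dct3_reconstruct are our hypothetical names.
import numpy as np
from scipy.fft import dctn, idctn

def dct3_reduce(X, k1, k2, k3):
    """Orthonormal 3D-DCT-II, then keep the k_j lowest frequencies per mode."""
    C = dctn(X, type=2, norm='ortho')
    return C[:k1, :k2, :k3]

def dct3_reconstruct(C, shape):
    """Zero-pad the kept coefficients and invert the transform."""
    full = np.zeros(shape)
    full[:C.shape[0], :C.shape[1], :C.shape[2]] = C
    return idctn(full, type=2, norm='ortho')

X = np.random.rand(32, 32, 32)                 # a voxel block
X_hat = dct3_reconstruct(dct3_reduce(X, 8, 8, 8), X.shape)
```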
3 Classification Based on Tensor Form
We adopt the multilinear tensor subspace method for third-order tensors. This method is a three-dimensional version of the 2DTSM [3]. For a third-order tensor \(\mathcal {X}\), setting \(\varvec{U}^{(j)}, \ j=1, 2, 3\), to be orthogonal matrices, we call the operation
$$\mathcal {Y} = \mathcal {X} \times _1 \varvec{U}^{(1)\top } \times _2 \varvec{U}^{(2)\top } \times _3 \varvec{U}^{(3)\top }$$
the orthogonal projection of \(\mathcal {X}\) to \(\mathcal {Y}\). Therefore, using this expression for a collection of tensors \(\{\mathcal {X}_i \}_{i=1}^M\), such that \(\mathcal {X}_i \in {\mathbb {R}^{I_1 \times I_2 \times I_3}}\) and \(\mathrm {E}(\mathcal {X}_i)=0\), the solutions of
$$\max \ \mathrm {E}\left( \Vert \mathcal {X}_i \times _1 \varvec{U}^{(1)\top } \times _2 \varvec{U}^{(2)\top } \times _3 \varvec{U}^{(3)\top } \Vert _F^2 \right) $$
with respect to \(\varvec{U}^{(j)\top }\varvec{U}^{(j)} =\varvec{I}\) for \(j=1, 2, 3\) define a trilinear subspace that approximates \(\{\mathcal {X}_i \}_{i=1}^M\). Therefore, using the orthogonal matrices \(\{ \varvec{U}_{k}^{(j)} \}_{j=1}^3\) obtained as the solutions of this maximisation for the kth category, if a query tensor \(\mathcal {G}\) satisfies the condition
$$\frac{\Vert \mathcal {G} \times _1 \varvec{U}_{k}^{(1)\top } \times _2 \varvec{U}_{k}^{(2)\top } \times _3 \varvec{U}_{k}^{(3)\top } \Vert _F}{\Vert \mathcal {G} \Vert _F} = \max _{l} \frac{\Vert \mathcal {G} \times _1 \varvec{U}_{l}^{(1)\top } \times _2 \varvec{U}_{l}^{(2)\top } \times _3 \varvec{U}_{l}^{(3)\top } \Vert _F}{\Vert \mathcal {G} \Vert _F},$$
we conclude that \(\mathcal {G} \in \mathcal {C}_k\), \(k,l=1,2, \dots , N_{\mathcal {C}}\), where \(\mathcal {C}_k\) and \(N_{\mathcal {C}}\) are the tensor subspace of the kth category and the number of categories, respectively. For the practical computation of the projection matrices \(\{ \varvec{U}_{k}^{(j)} \}_{j=1}^3\), we adopt the iterative method described in Algorithm 1.
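Putting the pieces together, the decision rule might be sketched as follows, assuming the per-category factor matrices \(\{ \varvec{U}_{k}^{(j)} \}_{j=1}^3\) have been learned, e.g. with the iterative procedure sketched in Sect. 2.3; similarity and classify are our names.

```python
# Sketch of classification by the tensor subspace method (TSM).
# Assumes one list [U1, U2, U3] of factor matrices per category.
import numpy as np

def mode_product(X, U, n):
    return np.moveaxis(np.tensordot(U, X, axes=(1, n)), 0, n)

def similarity(G, Us):
    """||G x_1 U1^T x_2 U2^T x_3 U3^T||_F / ||G||_F for one category."""
    Y = G
    for j, U in enumerate(Us):
        Y = mode_product(Y, U.T, j)
    return np.linalg.norm(Y) / np.linalg.norm(G)

def classify(G, subspaces):
    """Assign the query G to the category whose subspace retains the
    largest fraction of the query's norm."""
    return int(np.argmax([similarity(G, Us) for Us in subspaces]))
```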
4 Numerical Examples
We present two examples: the extraction of outline shapes from volumetric data and the classification of volumetric data by our method. For the experiments, we use voxel images of human livers obtained as CT images. This image set contains 25 male livers and seven female livers. Note that these voxel images are aligned at their centres of gravity. In the experiments, we project these voxel images to small-size tensors. For the projections, we adopt the TPCA and the 3D-DCT. In the iterative method of the TPCA, setting the number of bases to the size of the original tensors in Algorithm 1, we call the method full projection (FP). If we set the number of bases to be smaller than the size of the original tensors in Algorithm 1, we call the method full projection truncation (FPT). Table 1 summarises the sizes and numbers of the original and dimension-reduced voxel images.
Firstly, we show the approximation of a voxel image of a liver by the three methods. The FP, FPT and 3D-DCT reduce the size of the data from \(89 \times 97 \times 76\) voxels to \(32 \times 32 \times 32\) voxels. Figure 1 illustrates volume renderings of the original data and of the data reconstructed from these compressed tensors. Compared with Figs. 1(a) and 1(e), Figs. 1(b)–(d) and (f)–(h) show that the FP, FPT and 3D-DCT preserve the outline shape of the liver. In Fig. 1, the data reconstructed by the 3D-DCT give an outline shape and interior texture closer to those of the original than the FP and FPT do. These results show that projections to small-size tensors extract outline shapes.
For the analysis of the data projected by the FP, FPT and 3D-DCT, we decompose these projected tensors by Algorithm 1. Here, we set the size of the bases in Algorithm 1 to \(32 \times 32 \times 32\) and use 35 projected tensors of livers for each reduction method. In the decompositions, we reorder the eigenvalues \(\lambda ^{(j)}_i\), \(j=1, 2, 3\), \(i=1, 2, \dots , 32\), of the three modes to \(\lambda _i\), \(i=1, 2, \dots , 96\), in descending order. Figure 2 shows the cumulative contribution ratios (CCRs) of the reordered eigenvalues for the projected tensors obtained by the FP, FPT and 3D-DCT. Figure 3 illustrates reconstructed data obtained by using the 20 major principal components.
In Fig. 2, the profiles of the curves for the three methods almost coincide, while the CCR of the 3D-DCT is slightly higher than those of the others. For all three methods, the CCRs become higher than 0.8 if we select more than 19 major principal components. In Fig. 3, the shapes and interior textures for the three methods are almost the same. In Figs. 3(d)–(f), the interior texture of a liver is not preserved and the outer shape is blurred. These results show that, for all three methods, the major principal components represent outline shapes.
Secondly, we show the results of the classification of voxel images of livers by the TSM. For the classification, we use 25 male livers and seven female livers, since the sizes and shapes of the livers of males and females are statistically different. Figures 4(a) and 4(b) illustrate examples of livers of a male and a female, respectively. We use the voxel images of the livers of 13 males and four females as training data. The remaining voxel images are used as test data. In the recognition, we estimate the gender of livers. The recognition rate is defined as the ratio of successful estimations over 1000 gender estimations. In each estimation, a query is randomly chosen from the test dataset. For the 1-, 2- and 3-modes, we evaluate the results for multilinear subspaces with sizes from one to the dimension of the projected tensors. Figure 4(c) shows the results of the classification. The TSM gives a 90% recognition rate at best, with the tensor subspace spanned by the two major principal axes of each of the three modes.
Fig. 1. Original and reconstructed volumetric data of a liver. (a) shows the rendering of the original data. (b)–(d) show the renderings of the reconstructed data after the FP, FPT and 3D-DCT, respectively. (e)–(h) illustrate axial slice images of the volumetric data in (a)–(d), respectively. The sizes of the reduced tensors are shown in Table 1.
Fig. 3. Reconstruction using only the major principal components of the decomposition by the FP. The top and bottom rows illustrate volume renderings and axial slices of the reconstructed data, respectively. For the reconstruction, we use the 20 major principal components. The left, middle and right columns illustrate the results for the tensors projected by the FP, FPT and 3D-DCT, respectively.
Fig. 4. Recognition rates for liver data for original and compressed tensors. (a) and (b) illustrate examples of livers of a male and a female, respectively. (c) shows the recognition rates for the compression methods. For compression, we use the HOSVD, FP, FPT and 3D-DCT. In (c), the horizontal and vertical axes represent the compression ratio and the recognition rate [%], respectively. For the original size \(D = 89 \times 97 \times 76\) and the reduced size \(K = k \times k' \times k''\), the compression ratio is given as D / K.
5 Conclusions
We applied the three-way TPCA to the extraction of outline shapes from volumetric data and to their classification. In the numerical examples, we demonstrated that the three-way TPCA extracts the outline shapes of human livers. Furthermore, the TSM accurately classified the extracted outline shapes. Moreover, we showed that the 3D-DCT-based reduction approximates both the outline shape and the texture of livers.
Acknowledgements. This research was supported by the “Multidisciplinary Computational Anatomy and Its Application to Highly Intelligent Diagnosis and Therapy” project funded by a Grant-in-Aid for Scientific Research on Innovative Areas from MEXT, Japan, and by Grants-in-Aid for Scientific Research funded by the Japan Society for the Promotion of Science.
References
Kroonenberg, P.M.: Applied Multiway Data Analysis. Wiley, Hoboken (2008)
Lu, H., Plataniotis, K., Venetsanopoulos, A.: MPCA: multilinear principal component analysis of tensor objects. IEEE Trans. Neural Netw. 19(1), 18–39 (2008)
Itoh, H., Sakai, T., Kawamoto, K., Imiya, A.: Topology-preserving dimension-reduction methods for image pattern recognition. In: Kämäräinen, J.-K., Koskela, M. (eds.) SCIA 2013. LNCS, vol. 7944, pp. 195–204. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38886-6_19
Kass, M., Witkin, A., Terzopoulos, D.: Snakes: active contour models. IJCV 1(4), 321–331 (1988)
Cootes, T.F., Cooper, D., Taylor, C.J., Graham, J.: Active shape models - their training and application. CVIU 61, 38–59 (1995)
McInerney, T., Terzopoulos, D.: Deformable models in medical image analysis: a survey. Med. Image Anal. 1(2), 91–108 (1996)
Davies, R.H., Twining, C.J., Cootes, T.F., Taylor, C.J.: Building 3-D statistical shape models by direct optimization. IEEE Trans. Med. Imaging 29(4), 961–981 (2010)
Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.: Nonnegative Matrix and Tensor Factorizations. Wiley, Hoboken (2009)
Lathauwer, L.D., Moor, B.D., Vandewalle, J.: On the best rank-1 and rank-\((r_1, r_2, \dots , r_N)\) approximation of higher-order tensors. SIAM J. Matrix Anal. Appl. 21(4), 1324–1342 (2000)
Oja, E.: Subspace Methods of Pattern Recognition. Research Studies Press, Baldock (1983)
Ahmed, N., Natarajan, T., Rao, K.R.: Discrete cosine transform. IEEE Trans. Comput. C–23, 90–93 (1974)
Hamidi, M., Pearl, J.: Comparison of the cosine and Fourier transforms of Markov-1 signals. IEEE Trans. Acoust. Speech Sig. Process. 24(5), 428–429 (1976)