Incremental multi-linear discriminant analysis using canonical correlations for action recognition
Introduction
Many feature extraction methods have been used in recognition-related tasks, such as action recognition [1], [2], [3] and face recognition [4], [5]. Most traditional algorithms, such as principal component analysis (PCA) [6], [7] and linear discriminant analysis (LDA) [8], [9], represent an object as a one-dimensional vector. Recently, canonical correlations analysis (CCA) [10], [11], which reflects the degree of similarity between two image sets in orthogonal subspaces, has received increasing attention in recognition. Kim et al. [12] proposed an optimal discriminant function of canonical correlations (DCC) to transform image sets so that intra-class similarity is maximized while inter-class similarity is minimized. Wu et al. [13] proposed an incremental learning scheme to update the discriminant matrix for the analysis of canonical correlations (IDCC), which does not require complete re-training when training samples become available incrementally, thereby reducing computational cost. However, because DCC and IDCC vectorize the images, they break the original spatial structure and suffer from the curse of dimensionality.
To overcome this limitation, a number of multi-linear subspace analysis (MSA) methods [14], [15], [16] have been suggested for recognition-related tasks. In [17], the discriminant analysis with tensor representation (DATER) was proposed to capture most of the discriminatory information by maximizing a tensor-based scatter ratio criterion. The incremental tensor biased discriminant analysis (ITBDA) [18] distinguishes and tracks objects by learning the tensor biased discriminant subspace online. However, most MSA methods work directly on a single sample, without considering the canonical correlations between different samples.
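For reference, the canonical correlations that DCC-style methods rely on can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes each sample set is stored as a matrix whose columns are vectorized samples, and it uses the standard result that the canonical correlations between two subspaces are the singular values of the product of their orthonormal bases. The function names, sizes and random data are hypothetical.

```python
import numpy as np

def orthonormal_basis(samples, dim):
    """Orthonormal basis (as columns) of the top `dim` directions of a sample set.

    `samples` has shape (n_features, n_samples); each column is one vectorized sample.
    """
    u, _, _ = np.linalg.svd(samples, full_matrices=False)
    return u[:, :dim]

def canonical_correlations(set_a, set_b, dim):
    """Cosines of the principal angles between the subspaces of two sample sets."""
    basis_a = orthonormal_basis(set_a, dim)
    basis_b = orthonormal_basis(set_b, dim)
    # Singular values of basis_a^T basis_b are the canonical correlations,
    # returned in descending order.
    corr = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return np.clip(corr, 0.0, 1.0)

rng = np.random.default_rng(0)
set_a = rng.standard_normal((50, 8))
# Assumption for illustration: set_b is an invertible re-mixing of set_a,
# so both sets span the same 8-dimensional subspace.
mixing = np.eye(8) + 0.1 * rng.standard_normal((8, 8))
set_b = set_a @ mixing
set_c = rng.standard_normal((50, 8))  # an unrelated sample set

same = canonical_correlations(set_a, set_b, dim=8)
diff = canonical_correlations(set_a, set_c, dim=8)
print("same subspace:", np.round(same, 3))
print("unrelated set:", np.round(diff, 3))
```

Sets that span the same subspace give correlations close to one, while unrelated sets give clearly smaller values; a discriminant transform in the DCC family aims to widen exactly this gap between intra-class and inter-class sets.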
In this paper, we propose a novel CCA-based feature extraction method, called multi-linear discriminant analysis of canonical correlations (MDCC), which iteratively learns the multi-linear discriminant subspace using canonical correlations between different samples. We also develop an online learning scheme for MDCC, named incremental multi-linear discriminant analysis of canonical correlations (IMDCC). In IMDCC, newly added samples incrementally update the discriminant information so as to maximize the canonical correlations of intra-class samples while minimizing the canonical correlations of inter-class samples. We summarize the advantages of IMDCC as follows:
1. IMDCC operates on each mode of the training tensors separately to alleviate the curse of dimensionality problem.
2. IMDCC converges within a few iterations, as discussed in Section 3.3.
3. IMDCC demonstrates the high computational efficiency of tensor subspace learning.
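The first advantage can be made concrete with a small sketch of mode-wise projection. The code below is only an illustration under stated assumptions: the tensor sizes are hypothetical, and random matrices stand in for the learned discriminant transformation matrices, whose actual computation is not reproduced here. It compares the number of entries needed by one small matrix per mode with the number needed by a single matrix acting on the vectorized sample.

```python
import numpy as np

def mode_n_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` of shape (J_n, I_n)."""
    moved = np.moveaxis(tensor, mode, -1)    # bring mode n to the last axis
    projected = moved @ matrix.T             # (..., I_n) @ (I_n, J_n) -> (..., J_n)
    return np.moveaxis(projected, -1, mode)  # restore the original axis order

# Hypothetical sizes: a 3-order action sample (height x width x frames)
# projected to a much smaller 3-order feature tensor.
in_dims = (32, 24, 20)
out_dims = (8, 6, 5)

rng = np.random.default_rng(0)
sample = rng.standard_normal(in_dims)
# Random stand-ins for the per-mode transformation matrices (assumption).
mode_matrices = [rng.standard_normal((j, i)) for i, j in zip(in_dims, out_dims)]

projected = sample
for mode, matrix in enumerate(mode_matrices):
    projected = mode_n_product(projected, matrix, mode)

# Entries to learn: one small matrix per mode vs. one matrix on the vectorized sample.
mode_wise_params = sum(i * j for i, j in zip(in_dims, out_dims))
vectorized_params = int(np.prod(in_dims)) * int(np.prod(out_dims))
print(projected.shape, mode_wise_params, vectorized_params)
```

With these sizes the mode-wise scheme needs 500 matrix entries, whereas an equivalent projection of the vectorized sample needs 3,686,400, which is the kind of gap that operating on each mode separately is meant to avoid.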
The rest of the paper is organized as follows. In Section 2, we introduce the tensor algebra and DCC algorithm. In Section 3, we present the MDCC and IMDCC algorithms and discuss the convergence performance of IMDCC. In Section 4, we compare the experimental results and the computational cost of IMDCC with those of other methods. Finally, conclusions are drawn in Section 5.
Section snippets
Multi-linear algebra
A tensor is a multi-dimensional array. In this paper, scalars are denoted by lowercase letters, e.g., $a$. Vectors (1-order tensors) are denoted by bold lowercase letters, e.g., $\mathbf{a}$. Matrices (2-order tensors) are denoted by bold uppercase letters, e.g., $\mathbf{A}$. Higher-order tensors (3-order or higher) are denoted by calligraphic uppercase letters, e.g., $\mathcal{A}$.
An $N$-order tensor is represented as $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, where $I_n$ is the dimension of mode-$n$. An element of $\mathcal{A}$ is denoted as
Incremental multi-linear discriminant analysis of canonical correlations
An action sample is naturally represented by an N-order tensor. The purpose of the IMDCC method is to find the discriminant transformation matrix (DTM) which maps the original multi-linear space to , using canonical correlations of incremental tensors. Assuming that m tensor samples come from C classes: , where is the i-th N-order tensor in the c-th class, mc is the number of tensors in the c-th
Actions from the Weizmann database
The experiment was performed on the Weizmann database, a commonly used database for human action recognition. The database contains 90 low-resolution (180×144, 25 fps) videos from 10 action categories.
We extracted 3500 samples from these 90 videos; each sample consists of 20 successive frames, and a new sample begins every other frame. We used 3000 samples for training and the remaining 500 samples for testing. Both the training set and testing set contain all 10 different
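The sampling scheme described above (20 successive frames per sample, with a new sample starting every other frame) amounts to a sliding window with a stride of two frames. The sketch below illustrates that reading; the function names, the 60-frame video length and the synthetic frame contents are hypothetical, and the authors' exact extraction procedure may differ.

```python
import numpy as np

def clip_start_indices(n_frames, clip_len=20, stride=2):
    """Start indices of clips of `clip_len` successive frames, one clip every `stride` frames."""
    if n_frames < clip_len:
        return []
    return list(range(0, n_frames - clip_len + 1, stride))

def extract_clips(video, clip_len=20, stride=2):
    """Cut `video` of shape (n_frames, height, width) into overlapping clips.

    Returns an array of shape (n_clips, clip_len, height, width).
    """
    starts = clip_start_indices(video.shape[0], clip_len, stride)
    if not starts:
        return np.empty((0, clip_len) + video.shape[1:], dtype=video.dtype)
    return np.stack([video[s:s + clip_len] for s in starts])

# Hypothetical 60-frame video at 144 rows x 180 columns; every pixel of frame t
# holds the value t so that clip boundaries are easy to check.
n_frames, height, width = 60, 144, 180
video = np.broadcast_to(
    np.arange(n_frames, dtype=np.uint8)[:, None, None], (n_frames, height, width)
).copy()

clips = extract_clips(video)
print(clips.shape)  # each clip is one 3-order sample: frames x height x width
```

Each extracted clip is a 3-order array and can be treated directly as a tensor sample without vectorization.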
Conclusions
In this paper, we proposed MDCC, a novel CCA-based feature extraction method. MDCC iteratively learns the multi-linear discriminant subspace using the canonical correlations between different samples. Furthermore, we developed IMDCC, an online learning scheme for MDCC. IMDCC incrementally updates the discriminant transformation matrices so as to maximize the canonical correlations of intra-class samples while minimizing the canonical correlations of inter-class samples.
The features of
Acknowledgements
This paper is supported by (1) the National Natural Science Foundation of China under Grant Nos. 61175023, 60973092, 60903097, (2) project of science and technology innovation platform of computing and software science (985 engineering), (3) the Key Laboratory for Symbolic Computation and Knowledge Engineering of Ministry of Education, China, (4) the Natural Science Foundation of Jilin province of China under Grant No. 201115022, (5) W. Pang is funded by the UK Biotechnology and Biological
References (19)
- et al., Face recognition using second order discriminant tensor subspace analysis, Neurocomputing (2011)
- et al., Incremental discriminant-analysis of canonical correlations for action recognition, Pattern Recognition (2010)
- et al., Tensor rank one discriminant analysis: a convergent method for discriminative multilinear subspace selection, Neurocomputing (2008)
- et al., Incremental tensor biased discriminant analysis: a new color-based visual tracking method, Neurocomputing (2010)
- et al., A set of co-occurrence matrices on the intrinsic manifold of human silhouettes for action recognition
- et al., Eigen-space learning using semi-supervised diffusion maps for human action recognition
- L. Shao, X. Chen, Histogram of body poses and spectral regression discriminant analysis for human action...
- et al., Tensor discriminant color space for face recognition, IEEE Trans. Image Process. (2011)
- et al., Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. (1997)
Cheng-Cheng Jia is currently a Ph.D. candidate in the Department of Computer Science and Technology at Jilin University, China. She received her M.S. from the same department in June 2010. Her present research interests center on pattern recognition and image processing.
Su-Jing Wang received the Master's degree from the Software College of Jilin University, Changchun, China, in 2007. Since September 2008, he has been pursuing the Ph.D. degree at the College of Computer Science and Technology of Jilin University. He has published more than 20 scientific papers, including work in IEEE Transactions on Image Processing and Neurocomputing. His current research interests include pattern recognition, computer vision and machine learning. For details, please refer to his homepage http://sujingwang.name.
Xu-Jun Peng obtained his Ph.D. from the Department of Computer Science and Engineering at the State University of New York at Buffalo. Currently, he is a research scientist with Raytheon BBN Technologies. His research interests include machine learning, image processing and document analysis.
Wei Pang received the B.Sc. and M.Sc. degrees in computer science from Jilin University in 2001 and 2004, and the Ph.D. degree in computing science from the University of Aberdeen in 2009. He is currently a research fellow at the University of Aberdeen and also holds a lectureship at Jilin University. His research interests include qualitative model learning, evolutionary algorithms, and artificial immune systems.
Can-Yan Zhang is a master's student in the Department of Computer Science and Technology at Harbin Engineering University, Harbin, China. His present research interests center on distributed computation and networks.
Chun-Guang Zhou is a Jilin-province-management Expert, a Highly Qualified Expert of Jilin Province, and a One-hundred Science-Technique Elite of Changchun. He has been awarded the Governmental Subsidy from the State Department and holds positions in many national and international academic organizations. His research interests include the theories, models and algorithms of artificial neural networks, fuzzy systems and evolutionary computation, and applications of machine taste and smell, image manipulation, commercial intelligence, modern logistics, bioinformatics, and biometric identification based on computational intelligence. He has published over 168 papers in journals and conferences, as well as one academic book.
Zhe-Zhou Yu studied at the College of Computer Science and Technology, Jilin University, from 1978 and began working at the Changchun Institute of Fine Mechanics and Optics, Academia Sinica, in 1982. In 2000, he returned to Jilin University. His current research interests mainly include computational intelligence and embedded system applications. He has published over 40 research papers, more than 20 of which are indexed by EI/SCI/ISTP, and holds one national invention patent and three software copyrights.