Elsevier

Neurocomputing

Volume 83, 15 April 2012, Pages 56-63
Incremental multi-linear discriminant analysis using canonical correlations for action recognition

https://doi.org/10.1016/j.neucom.2011.11.006

Abstract

Canonical correlations analysis (CCA) is often used for feature extraction and dimensionality reduction. However, CCA requires images to be vectorized, which breaks the spatial structure of the original image, and the resulting high-dimensional vectors often cause the curse of dimensionality. In this paper, we propose a novel CCA-based feature extraction method in a multi-linear discriminant subspace, which encodes each action sample as a high-order tensor. An optimization approach is presented that iteratively learns the discriminant subspace by unfolding the tensor along its different modes. As a result, most of the underlying data structure, including the spatio-temporal information, is retained and the curse of dimensionality is alleviated. In addition, an incremental scheme is developed for online learning of the multi-linear subspace, which improves the discriminative capability efficiently. The nearest neighbor classifier (NNC) is employed for action classification. Experiments on the Weizmann database show that the proposed method outperforms state-of-the-art methods in terms of accuracy and time complexity, and that it is robust against partial occlusion.

Introduction

Nowadays, many feature extraction methods are used in recognition-related tasks, such as action recognition [1], [2], [3] and face recognition [4], [5]. Most traditional algorithms, such as principal component analysis (PCA) [6], [7] and linear discriminant analysis (LDA) [8], [9], represent an object as a one-dimensional vector. Recently, canonical correlations analysis (CCA) [10], [11], which measures the similarity between two image sets in orthogonal subspaces, has received increasing attention in recognition. Kim et al. [12] proposed an optimal discriminant function of canonical correlations (DCC) to transform image sets so that intra-class similarity is maximized while inter-class similarity is minimized. Wu et al. [13] proposed an incremental learning scheme to update the discriminant matrix for the analysis of canonical correlations (IDCC); it does not require complete re-training when training samples become available incrementally, which reduces the computational cost. However, because DCC and IDCC vectorize the images, they break the original spatial structure and suffer from the curse of dimensionality.
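To make the notion of canonical correlations between two image sets concrete, the following minimal sketch computes them as the cosines of the principal angles between two subspaces, using the standard SVD-based procedure. It is not the DCC or IDCC algorithm itself: the function name `canonical_correlations`, the subspace dimension `d`, and the random data are illustrative assumptions, and each set is assumed to be stored with one vectorized sample per column.

```python
import numpy as np

def canonical_correlations(X1, X2, d):
    """Cosines of the principal angles between the d-dimensional subspaces
    spanned by the leading left singular vectors of X1 and X2."""
    # Orthonormal basis of each set (columns of X are vectorized samples).
    P1 = np.linalg.svd(X1, full_matrices=False)[0][:, :d]
    P2 = np.linalg.svd(X2, full_matrices=False)[0][:, :d]
    # The singular values of P1^T P2 are the canonical correlations.
    s = np.linalg.svd(P1.T @ P2, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against round-off slightly above 1

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))   # hypothetical set: 8 samples of dimension 50
M = rng.standard_normal((8, 8))    # remixing the columns keeps the same span
same_span = canonical_correlations(X, X @ M, d=8)
other = canonical_correlations(X, rng.standard_normal((50, 8)), d=8)
```

Two sets that span the same subspace yield correlations close to 1, whereas unrelated sets yield smaller values; a discriminant transformation in the spirit of DCC aims to widen this gap between intra-class and inter-class pairs.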

In order to overcome this limitation, a number of multi-linear subspace analysis (MSA) methods [14], [15], [16] have been suggested for recognition related tasks. In [17], the discriminant analysis with tensor representation (DATER) was proposed to capture most of the discriminatory information by maximizing a tensor-based scatter ratio criterion. The incremental tensor biased discriminant analysis (ITBDA) [18] is suitable for distinguishing and tracking the objects by learning the tensor biased discriminant subspace online. However, most of the MSA methods work directly on a single sample, without considering the canonical correlations between different samples.

In this paper, we propose a novel CCA-based feature extraction method, called multi-linear discriminant analysis of canonical correlations (MDCC), to iteratively learn the multi-linear discriminant subspace using canonical correlations between different samples. We develop an online learning scheme for MDCC which is named incremental multi-linear discriminant analysis of canonical correlations (IMDCC). In IMDCC the added samples incrementally update the discriminant information, which can maximize the canonical correlations of the intra-class samples while minimizing the canonical correlations of the inter-class samples. We summarize the advantages of our algorithm IMDCC as follows:

  1. IMDCC operates on each mode of the training tensors separately, which alleviates the curse of dimensionality.

  2. The IMDCC optimization converges within a few iterations, as discussed in Section 3.3.

  3. IMDCC achieves high computational efficiency in tensor subspace learning.

The rest of the paper is organized as follows. In Section 2, we introduce the tensor algebra and DCC algorithm. In Section 3, we present the MDCC and IMDCC algorithms and discuss the convergence performance of IMDCC. In Section 4, we compare the experimental results and the computational cost of IMDCC with those of other methods. Finally, conclusions are drawn in Section 5.

Multi-linear algebra

A tensor is a multi-dimensional array. In this paper, scalars are denoted by lowercase letters, e.g., $a$. Vectors (first-order tensors) are denoted by bold lowercase letters, e.g., $\mathbf{a}$. Matrices (second-order tensors) are denoted by bold uppercase letters, e.g., $\mathbf{A}$. Higher-order tensors (third-order or higher) are denoted by calligraphic uppercase letters, e.g., $\mathcal{A}$.

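An $N$-order tensor is represented as $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_n \times \cdots \times I_N}$, where $I_n$ is the dimension of mode-$n$ ($1 \le n \le N$). An element of $\mathcal{A}$ is denoted as $\mathcal{A}_{i_1 i_2 \cdots i_n \cdots i_N}$ ($1 \le i_n \le I_n$)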
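As an illustration of unfolding a tensor along one of its modes, here is a minimal NumPy sketch. The helper name `unfold` is an assumption, modes are indexed from 0 rather than from 1, and the column ordering follows NumPy's row-major layout, which may differ from the ordering convention used in the full paper.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index axis `mode`, columns index all other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# Hypothetical 3-order tensor with I_1 = 2, I_2 = 3, I_3 = 4.
A = np.arange(24).reshape(2, 3, 4)
A0 = unfold(A, 0)   # shape (2, 12)
A1 = unfold(A, 1)   # shape (3, 8)
A2 = unfold(A, 2)   # shape (4, 6)
```

Each unfolding is an ordinary matrix with $I_n$ rows, so matrix tools can be applied one mode at a time without vectorizing the whole tensor.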

Incremental multi-linear discriminant analysis of canonical correlations

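An action sample is naturally represented by an $N$-order tensor. The purpose of the IMDCC method is to find the discriminant transformation matrices (DTMs) $\mathbf{T}_n \in \mathbb{R}^{I_n \times J_n}$ ($J_n < I_n$, $1 \le n \le N$), which map the original multi-linear space $\mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ to $\mathbb{R}^{J_1 \times J_2 \times \cdots \times J_N}$, using canonical correlations of incremental tensors. Assuming that $m$ tensor samples come from $C$ classes: $\{\mathcal{A}_1^1, \ldots, \mathcal{A}_{m_1}^1, \mathcal{A}_1^2, \ldots, \mathcal{A}_{m_2}^2, \ldots, \mathcal{A}_1^C, \ldots, \mathcal{A}_{m_C}^C\}$, where $\mathcal{A}_i^c \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is the $i$-th $N$-order tensor in the $c$-th class, $m_c$ is the number of tensors in the $c$-th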
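The mapping from the original space to the reduced space can be sketched as follows, assuming it is realized by the standard mode-n products with the transposed transformation matrices (the snippet above does not spell this out). The helper names `mode_product` and `project`, the tensor sizes, and the random matrices standing in for learned DTMs are all illustrative assumptions.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Mode-n product: multiply `matrix` (J x I_n) into axis `mode` of `tensor`."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))  # new axis comes first
    return np.moveaxis(out, 0, mode)                    # move it back into place

def project(tensor, dtms):
    """Apply T_n^T along every mode n, where dtms[n] has shape (I_n, J_n)."""
    for n, T in enumerate(dtms):
        tensor = mode_product(tensor, T.T, n)
    return tensor

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5, 4))                      # I = (6, 5, 4)
# Random placeholders, not learned discriminant matrices.
dtms = [rng.standard_normal((6, 3)),
        rng.standard_normal((5, 2)),
        rng.standard_normal((4, 2))]                    # J = (3, 2, 2)
B = project(A, dtms)                                    # shape (3, 2, 2)
```

Because each matrix acts on a single mode, the number of parameters grows with the sum of the $I_n J_n$ terms rather than with the product of all dimensions, which is the usual argument for why mode-wise learning eases the curse of dimensionality.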

Actions from the Weizmann database

The experiment was performed on the Weizmann database, a commonly used database for human action recognition. The database contains 90 low-resolution (180×144, 25 fps) videos from 10 action categories.

We extracted 3500 samples from these 90 videos; each sample consists of 20 successive frames, and a new sample begins at every other frame. We used 3000 samples for training and the remaining 500 samples for testing. Both the training set and the testing set contain all 10 different
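The sampling scheme described above can be sketched as a sliding window, assuming that "every other frame" means a stride of two frames between the starts of consecutive samples. The function name `window_starts` and the 30-frame example are hypothetical and are not taken from the paper.

```python
def window_starts(num_frames, length=20, stride=2):
    """Start indices of all full windows of `length` frames, `stride` apart."""
    return list(range(0, num_frames - length + 1, stride))

# A hypothetical 30-frame clip yields six overlapping 20-frame samples.
starts = window_starts(30)
samples = [(s, s + 20) for s in starts]   # half-open frame ranges [start, end)
```

Stacking the 20 frames of one window along a third axis gives the height × width × time tensor that represents a single action sample.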

Conclusions

In this paper, we proposed MDCC, a novel CCA-based feature extraction method. MDCC iteratively learns the multi-linear discriminant subspace using the canonical correlations between different samples. Furthermore, we developed IMDCC, an online learning scheme for MDCC. IMDCC incrementally updates the discriminant transformation matrices, which maximizes the canonical correlations of intra-class samples while minimizing the canonical correlations of inter-class samples.

The features of

Acknowledgements

This paper is supported by (1) the National Natural Science Foundation of China under Grant Nos. 61175023, 60973092, 60903097, (2) project of science and technology innovation platform of computing and software science (985 engineering), (3) the Key Laboratory for Symbolic Computation and Knowledge Engineering of Ministry of Education, China, (4) the Natural Science Foundation of Jilin province of China under Grant No. 201115022, (5) W. Pang is funded by the UK Biotechnology and Biological

Cheng-Cheng Jia is currently a Ph.D. candidate in the Department of Computer Science and Technology at Jilin University, China. She received her M.S. from the Department of Computer Science and Technology at Jilin University, China, in June 2010. Her present research interest centers on pattern recognition and image processing.

Su-Jing Wang received the Master's degree from the Software College of Jilin University, Changchun, China, in 2007. Since September 2008, he has been pursuing the Ph.D. degree at the College of Computer Science and Technology of Jilin University. He has published more than 20 scientific papers, including papers in IEEE Transactions on Image Processing and Neurocomputing. His current research interests include pattern recognition, computer vision and machine learning. For details, please refer to his homepage http://sujingwang.name.

Xu-Jun Peng obtained his Ph.D. from the Department of Computer Science and Engineering at the State University of New York at Buffalo. Currently, he is a research scientist with Raytheon BBN Technologies. His research interests include machine learning, image processing and document analysis.

Wei Pang received the B.Sc. and M.Sc. degrees in computer science from Jilin University in 2001 and 2004, and the Ph.D. degree in computing science from the University of Aberdeen in 2009. He is currently a research fellow at the University of Aberdeen, and also holds a lectureship at Jilin University. His research interests include qualitative model learning, evolutionary algorithms, and artificial immune systems.

Can-Yan Zhang is a master's student in the Department of Computer Science and Technology at Harbin Engineering University, Harbin, China. His present research interest centers on distributed computation and networks.

Chun-Guang Zhou is a Jilin-province-management Expert, a Highly Qualified Expert of Jilin Province, and a One-hundred Science-Technique Elite of Changchun. He has been awarded the Governmental Subsidy from the State Department. He holds concurrent posts in many national and international academic organizations. His research interests include the theories, models and algorithms of artificial neural networks, fuzzy systems and evolutionary computation, and applications of machine taste and smell, image manipulation, commercial intelligence, modern logistics, bioinformatics, and biometric identification based on computational intelligence. He has published over 168 papers in journals and conferences, as well as one academic book.

Zhe-Zhou Yu studied at the College of Computer Science and Technology, Jilin University, from 1978 and joined the Changchun Institute of Fine Mechanics and Optics, Academia Sinica, in 1982. In 2000, he returned to Jilin University. His current research interests mainly include computational intelligence and embedded system applications. He has published over 40 research papers, more than 20 of which are indexed by EI/SCI/ISTP, and holds one national invention patent and three software copyrights.
