Knowledge-Based Systems

Volume 66, August 2014, Pages 126-135
Supervised locality discriminant manifold learning for head pose estimation

https://doi.org/10.1016/j.knosys.2014.04.028

Highlights

  • We propose a novel supervised locality discriminant manifold learning approach.

  • We combine the discriminant graph embedding and Laplacian regularized least square.

  • We design an optimal supervised weight for estimating head pose more accurately.

Abstract

In this paper, we propose a novel supervised manifold learning approach, supervised locality discriminant manifold learning (SLDML), for head pose estimation. Traditional manifold learning methods focus on preserving only the intra-class geometric properties of the manifold embedded in the high-dimensional ambient space, so they cannot fully exploit the underlying discriminative knowledge of the data. The proposed SLDML explores both the geometric structure and the discriminant information of the data, and yields a smooth and discriminative low-dimensional embedding by adding local discriminant terms to the optimization objective of manifold learning. Moreover, to handle out-of-sample extension efficiently and to learn with local consistency, we decompose manifold learning into a two-step approach: we combine manifold learning with a regression step that learns a discriminant manifold-based projection function through discriminatively Laplacian regularized least squares. SLDML provides both a low-dimensional embedding and a projection function with better intra-class compactness and inter-class separability, and therefore preserves the local geometric structures more effectively. Meanwhile, SLDML is supervised by both a biased distance and continuous head pose angle information when constructing the graph, embedding the graph and learning the projection function. Our experiments on the publicly available FacePix dataset demonstrate the superiority of the proposed SLDML over several current state-of-the-art approaches for head pose estimation.

Introduction

Head pose estimation is an integral component of multi-view face recognition systems, driver attention monitoring and other human-centered computing applications, and it has become an active research topic in computer vision [1]. Currently, state-of-the-art methods for head pose estimation make intensive use of manifold learning and embedding techniques [2]. These techniques are based on the idea that the dimensionality of the dataset is only artificially high and that the data may be described as a function of only a few underlying parameters; they attempt to uncover these parameters in order to find low-dimensional representations of the data. That is to say, the data points are actually sampled from a low-dimensional manifold embedded in a high-dimensional ambient space. For head pose estimation, the fundamental assumption is that face images with varying pose angles are data points lying on a smooth low-dimensional manifold constrained by the head pose variations. It is generally believed that this manifold models the nonlinear and continuous variation of face appearance with head pose angle; if the manifold is learned properly, new face images can then be embedded in the low-dimensional space to estimate their head poses.

Manifold learning methods include nonlinear approaches, such as Isometric Feature Mapping (ISOMAP) [3], Laplacian Eigenmaps (LE) [4] and Locally Linear Embedding (LLE) [5], and linear approaches, such as Locality Preserving Projections (LPP) [6] and Neighborhood Preserving Embedding (NPE) [7]. Typically, a quadratic objective derived from a neighborhood graph is set up and solved for its leading eigenvectors.
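As an illustration of this eigenvector formulation, the following minimal Laplacian Eigenmaps sketch (our own toy code, not any paper's implementation; the heat-kernel weights, neighborhood size and synthetic data are assumptions) builds a k-NN graph and solves the generalized eigenproblem Ly = λDy, discarding the trivial constant eigenvector:

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def laplacian_eigenmap(X, n_components=2, k=5, sigma=1.0):
    """Embed X (n x D) into n_components dimensions via the generalized
    eigenproblem L y = lambda D y derived from a symmetric k-NN graph."""
    n = X.shape[0]
    dist = cdist(X, X)
    W = np.zeros((n, n))
    neighbors = np.argsort(dist, axis=1)[:, 1:k + 1]  # k nearest, excluding self
    for i in range(n):
        for j in neighbors[i]:
            w = np.exp(-dist[i, j] ** 2 / (2.0 * sigma ** 2))  # heat-kernel weight
            W[i, j] = W[j, i] = w
    D = np.diag(W.sum(axis=1))
    L = D - W                           # unnormalized graph Laplacian
    vals, vecs = eigh(L, D)             # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]  # drop the trivial constant eigenvector

# toy data: points along a noisy helix segment in 3-D
rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 60)
X = np.c_[np.cos(t), np.sin(t), t] + 0.01 * rng.standard_normal((60, 3))
Y = laplacian_eigenmap(X, n_components=1, k=6)
print(Y.shape)
```

The recovered one-dimensional coordinate varies monotonically with the underlying curve parameter, which is exactly the behavior a pose manifold method relies on.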

Nonlinear manifold learning assumes that the underlying structure of real data is often highly nonlinear and hence cannot be accurately approximated by linear subspaces. It aims to preserve certain geometric properties among neighboring data points when projecting the high-dimensional data to a low-dimensional embedding. The preserved properties differ across methods: pairwise geodesic distances in ISOMAP [3], local linear reconstruction weights in LLE [5], and local distances in LE [4]. By incorporating prior knowledge of class labels, manifold learning methods can also perform pattern classification in the feature space. Most of the above algorithms are formulated as convex optimization problems. These models generally assume that the low-dimensional manifold is isometric to a convex subset of Euclidean space, and problems may arise from high curvature of the manifold and from out-of-sample extension for non-isometric manifolds [8].

For nonlinear manifold learning, a critical issue is the lack of a direct mapping from the input space to the manifold space, which limits the applicability of these methods. One solution was presented in [9], where the distance matrix was viewed as a kernel and new points were embedded using the Nyström approximation. Unfortunately, this is a rather complex and time-consuming process, and it does not have a clear interpretation in the LLE [5] case. More recent linear manifold learning approaches (which can be thought of as subspace learning [10]), such as LPP [6] and NPE [7], try to overcome the out-of-sample extension problem by performing a linear approximation of the underlying manifold. The fundamental idea behind these methods is that manifolds, even though they are nonlinear structures, can be approximated reasonably well by a linear map within a local neighborhood. The benefit of linear approximations is a saving in computational time.

Nonlinear manifold learning thus makes out-of-sample extension tricky, while linearized approaches offer only an approximation of the underlying manifold structure. Hence, we propose a novel manifold learning approach to overcome both issues. We decompose manifold learning into a two-step approach: graph embedding [11] to learn the underlying manifold, and regression to learn a projection function based on the learned manifold. The regression constructs a direct map, obtained by discriminatively Laplacian regularized least squares, to project both the training data and new data. By treating the regression as a process of building the projection function, rather than as a direct linear transformation from the input space to the low-dimensional embedding, different kinds of regularizers can be naturally incorporated.

In particular, Locality Sensitive Discriminant Analysis (LSDA) [12] has recently been proposed to exploit both geometric and discriminant information simultaneously in manifold learning. LSDA [12] incorporates discriminant information based on the graph Laplacian and has demonstrated better performance than several other manifold learning methods. It constructs an intra-class graph and an inter-class graph to model the local neighborhood relationships of the data according to their labels. By maximizing the margin between data points of different classes in each local neighborhood, it achieves discriminative power for classification in the reduced subspace. Motivated by this, we incorporate discriminant knowledge as a supplement to the objective functions of manifold learning in order to find the optimal low-dimensional embedding for estimating head poses accurately.
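The intra-class/inter-class graph construction that LSDA-style methods rely on can be sketched as follows (a hedged illustration with discrete toy labels; the function name, the binary 0/1 edge weights and the synthetic data are our own choices, not those of [12]):

```python
import numpy as np
from scipy.spatial.distance import cdist

def build_discriminant_graphs(X, labels, k=5):
    """Split each point's k nearest neighbors into an intra-class graph Gw
    (same label) and an inter-class graph Gb (different label)."""
    n = X.shape[0]
    dist = cdist(X, X)
    Ww = np.zeros((n, n))  # within-class (intra-class) adjacency
    Wb = np.zeros((n, n))  # between-class (inter-class) adjacency
    for i in range(n):
        for j in np.argsort(dist[i])[1:k + 1]:  # k nearest, excluding self
            if labels[i] == labels[j]:
                Ww[i, j] = Ww[j, i] = 1.0
            else:
                Wb[i, j] = Wb[j, i] = 1.0
    return Ww, Wb

# two well-separated Gaussian classes standing in for two pose classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 4)), rng.normal(4.0, 1.0, (20, 4))])
labels = np.array([0] * 20 + [1] * 20)
Ww, Wb = build_discriminant_graphs(X, labels, k=5)
```

The Laplacians of Ww and Wb then enter the objective with opposite signs: edges in Ww are pulled together while edges in Wb are pushed apart.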

Furthermore, in the regression stage of the proposed two-step approach, we develop discriminatively Laplacian regularized least squares, partly inspired by discriminatively regularized least squares [13]. We directly embed the discriminative information, as well as the local geometry of the data based on the graph Laplacian, into the regularization term so that the regression can exploit as much underlying knowledge as possible.
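A minimal sketch of a Laplacian-regularized least squares of this general kind is shown below (the closed form, the regularization weights and the toy chain graph are our own illustrative assumptions; the paper's discriminative variant additionally encodes label information in the Laplacian):

```python
import numpy as np

def laplacian_rls(X, Y, L, lam_ridge=1e-2, lam_lap=1e-1):
    """Closed-form solution of a Laplacian-regularized least squares:
        min_P ||X P - Y||^2 + lam_ridge ||P||_F^2 + lam_lap tr(P^T X^T L X P)
    where L is a graph Laplacian encoding local (and, in a discriminant
    variant, label-dependent) structure.  Returns the projection matrix P."""
    D = X.shape[1]
    A = X.T @ X + lam_ridge * np.eye(D) + lam_lap * (X.T @ L @ X)
    return np.linalg.solve(A, X.T @ Y)

# toy usage: learn a projection from 5-D inputs to 2-D target coordinates
rng = np.random.default_rng(2)
n = 30
X = rng.standard_normal((n, 5))
Y = rng.standard_normal((n, 2))   # stand-in for low-dimensional embedding coordinates
W = np.zeros((n, n))
for i in range(n - 1):            # simple chain graph purely for illustration
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W
P = laplacian_rls(X, Y, L)
x_new = rng.standard_normal(5)
y_new = x_new @ P                 # out-of-sample projection is a direct linear map
```

Because the learned map is an explicit matrix P, new images are projected by a single matrix-vector product, which is what makes out-of-sample extension cheap in this formulation.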

Meanwhile, several head pose estimation methods based on manifold learning have been proposed. Raytchev et al. applied an ISOMAP-based manifold learning technique to user-independent pose estimation [14]. Fu and Huang presented an appearance-based strategy for head pose estimation using supervised Graph Embedding (GE) analysis [15]. To incorporate the pose labels that are usually available during the training phase, Balasubramanian et al. proposed the Biased Manifold Embedding (BME) framework [16] for head pose estimation, which uses a biased distance measure to determine the k nearest neighbors, so that head pose angle information can be incorporated as prior knowledge for estimating head pose more accurately. BME [16] uses a Generalized Regression Neural Network (GRNN) to learn the nonlinear mapping for out-of-sample data points, and applies linear multivariate regression to estimate the pose. In [17], BenAbdelkader proposed a Supervised Manifold Learning (SML) framework for head pose estimation that incorporates prior knowledge of head pose angles and uses nonlinear mappings, cubic smoothing splines and support vector regression, to estimate head pose angles from embedded face images. SML [17] outperforms other head pose estimation methods by incorporating continuous head pose angle information in graph construction and graph embedding. All these methods have demonstrated their effectiveness for head pose estimation. However, they fail to handle out-of-sample data points efficiently. In addition, they use nonlinear mappings (e.g., GRNN and cubic smoothing splines) to estimate head poses.
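The biasing idea behind BME can be illustrated as follows (a hedged sketch only: the exact weighting function in [16] differs, and the multiplicative form, the alpha parameter and the toy data here are our own assumptions):

```python
import numpy as np
from scipy.spatial.distance import cdist

def biased_distance(X, poses, alpha=5.0):
    """Inflate the appearance distance between two face images in proportion
    to their head-pose-angle difference, so that nearest-neighbor graphs
    favor neighbors with similar poses (illustrative only; the functional
    form in BME differs)."""
    d_app = cdist(X, X)
    d_pose = np.abs(poses[:, None] - poses[None, :])
    d_pose = d_pose / (d_pose.max() + 1e-12)   # normalize pose gaps to [0, 1]
    return d_app * (1.0 + alpha * d_pose)

# toy data: 12 "images" with evenly spaced pose labels in degrees
rng = np.random.default_rng(4)
poses = np.linspace(-90.0, 90.0, 12)
X = rng.standard_normal((12, 6))
Db = biased_distance(X, poses)
```

Running k-NN on Db instead of the raw appearance distance pulls same-pose images together in the graph, which is the prior knowledge BME injects.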

Instead, we propose a novel manifold learning approach, supervised locality discriminant manifold learning (SLDML), which decomposes manifold learning into a two-step approach: a graph embedding stage and a regression stage. First, we construct an intra-class graph Gw and an inter-class graph Gb according to the head pose labels, so that the geometric structure and the discriminant knowledge of the data are accurately characterized by the two graphs. Based on the graph Laplacian, we then add a local discriminant term to the optimization objective of manifold learning, constructed by minimizing the margins between data points in Gw and maximizing the margins between data points in Gb. Second, we develop discriminatively Laplacian regularized least squares, which maps the data to the low-dimensional reduced space and thereby handles out-of-sample extension directly and more effectively. Moreover, we incorporate continuous head pose angle information into all stages of the manifold learning. For a new face image, we first embed it in the low-dimensional space, determine its k nearest neighbors there, and then estimate the head pose angle.
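The final estimation step described above can be sketched as follows (a hedged illustration: the inverse-distance weighting, the neighborhood size and the toy random projection are our own assumptions, not the paper's exact scheme):

```python
import numpy as np

def estimate_pose(x_new, P, Y_train, pose_train, k=3):
    """Embed a new image with the learned projection P (D x d), then return
    the inverse-distance-weighted mean pose angle of its k nearest training
    embeddings."""
    y = x_new @ P
    d = np.linalg.norm(Y_train - y, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-12)
    return float(np.sum(w * pose_train[idx]) / w.sum())

# toy setup: random projection and training set with evenly spaced pose labels
rng = np.random.default_rng(3)
D, d, n = 8, 2, 50
P = rng.standard_normal((D, d))
X_train = rng.standard_normal((n, D))
Y_train = X_train @ P
pose_train = np.linspace(-90.0, 90.0, n)
x_new = X_train[10] + 0.01 * rng.standard_normal(D)  # near training sample 10
est = estimate_pose(x_new, P, Y_train, pose_train)
print(est)
```

For a query close to a training image, the weighted estimate stays close to that image's pose label, as expected.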

The rest of this paper is organized as follows. Section 2 describes our supervised locality discriminant manifold learning approach. Section 3 presents experiments on a public dataset to show the robustness and superiority of our method over other state-of-the-art methods. Finally, Section 4 concludes our work with a summary and introduces future work.

Section snippets

Proposed method

In this section, we introduce the supervised locality discriminant manifold learning approach. Specifically, for a given data set of n points in D-dimensional space, we denote X = {(x₁, z₁), (x₂, z₂), …, (xₙ, zₙ)}, xᵢ ∈ ℝᴰ, where zᵢ is the label (i.e., head pose angle) of xᵢ. We wish to reduce the dimensionality of this data, and assume only that the data lie on a d-dimensional manifold embedded in ℝᴰ, where d < D. Moreover, we assume that the manifold is given by a single coordinate chart f. We can now describe

Experiments results

In this section, we investigate the use of our proposed SLDML approach for head pose estimation, and compare SLDML with the current state-of-the-art methods: Biased Manifold Embedding (BME) [16] and Supervised Manifold Learning (SML) [17].

To demonstrate the efficiency of our approach, comprehensive experiments are conducted on the FacePix data set, a well-known publicly available database. The FacePix data set is provided by CUbiC (the Center for Cognitive Ubiquitous Computing) and consists of

Conclusion and future work

In this paper, we propose a novel supervised locality discriminant manifold learning (SLDML) approach for head pose estimation. The main contribution of our work is that we decompose manifold learning into a two-step approach, which incorporates nonlinear manifold learning in the graph embedding stage (the first step) and projection function learning in the regression stage (the second step). Moreover, we construct the local discriminant term in the optimization objectives to find the

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 61001143), by the Key Project Funds from the Province Office of Sci. & Tech. of Fujian in China (Grant No. 2012H6024) and by the Natural Science Foundation of Fujian Province (Grant No. 2012J01286).

References (19)

  • J. Wu et al.

    A two-stage pose estimation framework and evaluation

    Pattern Recogn.

    (2008)
  • E. Murphy-Chutorian et al.

    Head pose estimation in computer vision: a survey

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2009)
  • J. Tenenbaum et al.

    A global geometric framework for nonlinear dimensionality reduction

    Science

    (2000)
  • M. Belkin et al.

    Laplacian eigenmaps for dimensionality reduction and data representation

    Neural Comput.

    (2003)
  • K.S. Lawrence et al.

    Think globally, fit locally: unsupervised learning of low dimensional manifolds

    J. Mach. Learn. Res.

    (2003)
  • X. He et al.

    Locality preserving projections

    Adv. Neural Inform. Process. Syst.

    (2004)
  • X. He, D. Cai, S. Yan, H. Zhang, Neighborhood preserving embedding, in: International Conference on Computer Vision,...
  • P. Dollár, V. Rabaud, S. Belongie, Non-isometric manifold learning: analysis and an algorithm, in: Proceedings of...
  • Y. Bengio et al.

    Out-of-sample extensions for LLE, isomap, MDS, eigenmaps, and spectral clustering

    Adv. Neural Inform. Process. Syst.

    (2003)
There are more references available in the full text version of this article.

Cited by (12)

  • Constrained discriminant neighborhood embedding for high dimensional data feature extraction

    2016, Neurocomputing
    Citation Excerpt:

    Based on the above theoretic analysis, the outline of the proposed CDNE is summarized in Table 1. In this section, CMU PIE face data, ORL face data and FERET face data are applied to evaluate the performance of the proposed CDNE algorithm by making comparisons to some related dimensionality reduction methods such as discriminant neighborhood embedding (DNE) [4], supervised locality discriminant manifold learning (SLDML) [7], discriminant sparse neighborhood preserving embedding (DSNPE) [5], local graph embedding based on maximum margin criterion (LGE/MMC) [8], uncorrelated discriminant locality preserving projections (UDLPP) [6] and LUDP. The above methods besides CDNE are all supervised manifold learning based approaches, where both class information and manifold local structure are taken into account to model the corresponding objective functions.

  • Extended semi-supervised fuzzy learning method for nonlinear outliers via pattern discovery

    2015, Applied Soft Computing Journal
    Citation Excerpt:

    A good image feature representation method could significantly reduce the computational complexity of a model. The methods that utilize the different results of output variables to identify the best subset of given features in a dataset can be divided into supervised and unsupervised methods [6–10]. Despite extensive studies have used supervised or unsupervised models to exploit effective feature representations, few attempts have been made to identify important outlier instances by means of the SSL methods [11–13].

  • Hope: heatmap and offset for pose estimation

    2022, Journal of Ambient Intelligence and Humanized Computing
  • Students head-pose estimation using partially-latent mixture

    2020, Lecture Notes in Electrical Engineering