Elsevier

Image and Vision Computing

Volume 30, Issue 2, February 2012, Pages 100-108

Morphable model space based face super-resolution reconstruction and recognition

https://doi.org/10.1016/j.imavis.2012.01.005

Abstract

Super-resolution image reconstruction is the process of producing a high-resolution image from a set of low-resolution images of the same scene. In applications such as face evaluation and/or recognition from low-resolution video surveillance, super-resolution image reconstruction has mainly been used as a separate preprocessing step that produces a high-resolution image in the pixel domain, which is then passed to a face feature extraction and recognition algorithm. Such a three-stage approach suffers from a high degree of computational complexity. This paper proposes a face super-resolution reconstruction and recognition algorithm based on a low-dimensional morphable model space. The approach constructs the high-resolution information required by both reconstruction and recognition directly in the low-dimensional feature space. We show that, compared with generic pixel-domain algorithms, the proposed approach is more robust and more computationally efficient.

Highlights

► We present a super-resolution face reconstruction and recognition algorithm based on a lower-dimensional morphable model space.
► The approach constructs the high-resolution information required by both reconstruction and recognition directly in the low-dimensional feature space.
► The approach is more robust and more computationally efficient.

Introduction

In many real applications, like performing face evaluation and/or recognition from low-resolution (LR) surveillance videos, we need to apply magnification to the LR face images. Nevertheless, generic magnification techniques usually result in blurred images, and unacceptable recognition rates can be expected if such blurred images are passed to a face recognition system. Therefore, super-resolution (SR) techniques have been proposed to solve this problem [1], [2], [3], [4].

SR image reconstruction is the process of recovering a high-resolution (HR) image from a single LR image or a set of LR images. Single-frame SR image reconstruction resolves high-frequency information via various methods, e.g., the probability density distribution of the scene [28] or learning from examples [1]. The essential difference between single-frame and multi-frame SR image reconstruction is that, in the multi-frame case, new high-frequency information can also be recovered from different frames [22], [23]. In the ideal case, at most $q^2$ non-redundant LR frames (each frame has a different motion vector and thus contains different sub-pixel information) can contribute to a resolution enhancement of $q$ in each dimension. However, for real image sequences, it is almost impossible to acquire $q^2$ non-redundant frames, except with translation-controlled imagers or similar systems.

Many SR algorithms have been proposed so far, including the frequency domain method, the projection onto convex sets (POCS) method, and the maximum a posteriori (MAP) method [5], [6], [7], [8], [9], [10], [11]. Among these, MAP has received much attention in recent years because of its advantages: a complete theoretical framework, a flexible spatial-domain observation model, a natural way to include a priori knowledge, and superior results.

To improve the quality of the reconstructed human face, Baker and Kanade [3] proposed adding an LR-HR face mapping based prior to the MAP framework. Capel and Zisserman [2] presented a similar approach, but the mapping was carried out separately for different regions of a face. Though these methods [2], [3] can improve the quality of the final HR face image to some extent, they also increase the computational complexity and the sensitivity to noise. In [12], [13], face SR reconstruction and recognition were performed directly in Eigenface space. Since Eigenface space is only a subspace generated by applying PCA to the pixel space, its capability of representing a human face is worse than that of the pixel space. Such methods can therefore be expected to produce worse results than pixel-space-based methods. By learning the mapping among faces with different resolutions in a unified feature space, Li et al. [14] found that it was possible to carry out LR face recognition without any SR reconstruction preprocessing. Hennings-Yeomans et al. [15] proposed an approach for the recognition of LR faces that includes the SR model and face features in a regularization framework. However, those methods [14], [15] are meant for recognition, not reconstruction.

Morphable model space is constructed by calculating dense pixelwise correspondence between different example images of objects of the same class, so it contains the shape and texture characteristics of the objects of that class. It was shown that with morphable model space, it is possible to reconstruct a complete human face even when one quarter of the face information is lost [16], [17]. Moreover, for the recognition of faces covering large variations in pose and illumination, Blanz and Vetter [18] achieved recognition rates above 95% in the morphable model space. All these studies [16], [17], [18] indicate that morphable model space is a powerful and versatile representation for human faces. Since SR image reconstruction is also an information recovery process, in which low-frequency information is used to resolve the hidden high-frequency information, it is reasonable to expect better results when using the morphable model space to perform face SR reconstruction and recognition.

SR image reconstruction is also computationally expensive. Many works have been published to address this problem, and they can be roughly grouped into two categories. The first category originated with Elad and Hel-Or [24] and is characterized by breaking the SR reconstruction into two consecutive stages, fusing and de-blurring, and applying a fast algorithm to either or both of the two steps. Similar approaches include the following: S. Farsiu et al. [29] presented a fast implementation of the second step using L1 norm minimization; Miravet and Rodriguez [25] accelerated the process by using a hybrid neural network and a filtering operation for the first and the second step, respectively; M. Protter et al. [26] suggested a fast nonlocal-means filtering algorithm for the first step. The second category speeds up the SR process with numerical techniques. For example, Hardie et al. [27] suggested the conjugate gradient (CG) method, and Nguyen et al. [22] presented the preconditioned conjugate gradient (PCG) method.

Unlike the previous works, in this paper, we present a novel multi-frame face SR observation model between LR and HR morphable model space, and propose a complete MAP framework to solve this model. The proposed method also reduces the computational complexity by performing face reconstruction and recognition in a low-dimensional space.

A morphable model space of human face

The 2D morphable model space of the human face is a vector space in which any combination of the shape and texture vectors $S_i$ and $T_i$ of a set of examples describes a realistic human face, or more formally [16], [18]:
$$S=\sum_i \alpha_i S_i,\qquad T=\sum_i \beta_i T_i$$
where the factors $\alpha_i$ and $\beta_i$ satisfy the normalization condition $\sum_i \alpha_i = 1$, $\sum_i \beta_i = 1$. Continuous changes in the model parameters $\alpha_i$ generate a smooth transition such that each point of the initial surface moves toward a point on the final surface. Just as in morphing, artifacts are avoided
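The weighted combination described above can be illustrated with a minimal sketch. The function name `morph` and the toy vectors are hypothetical and not taken from the paper; the only property assumed from the text is that the weights sum to one.

```python
def morph(examples, weights):
    """Return sum_i w_i * v_i for example vectors v_i, with weights summing to 1."""
    if len(examples) != len(weights):
        raise ValueError("need exactly one weight per example vector")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must satisfy the normalization condition (sum to 1)")
    dim = len(examples[0])
    return [sum(w * v[k] for w, v in zip(weights, examples)) for k in range(dim)]


# Two toy "shape" vectors (hypothetical flattened landmark coordinates).
shapes = [[0.0, 0.0, 2.0, 2.0],
          [1.0, 1.0, 4.0, 0.0]]
alphas = [0.25, 0.75]          # normalized: 0.25 + 0.75 = 1

new_shape = morph(shapes, alphas)
print(new_shape)
```

The same function would apply unchanged to texture vectors with weights $\beta_i$; sweeping the weights continuously moves the result smoothly between the examples.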

SR reconstruction in pixel space

In pixel space SR image reconstruction, each LR frame contributes new information for interpolating sub-pixel values of the same scene. To get different information of the same scene, relative scene motions must be recorded from frame to frame. If these scene motions are known or can be estimated within sub-pixel accuracy, SR image reconstruction is possible. Fig. 3 illustrates a simple SR image reconstruction conceptually. In the figure, the pixel (1,1) of the LR image is a weighted average
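The idea that each LR pixel is a weighted average of HR pixels, and that shifted frames carry different information, can be sketched as follows. This is an illustrative toy model only: it assumes uniform weights over a q x q block, integer shifts on the HR grid (one HR pixel equals 1/q of an LR pixel), and wrap-around at the borders. None of these choices is stated in the source, and the function name `downsample` is hypothetical.

```python
def downsample(hr, q, dy=0, dx=0):
    """Simulate one LR frame from an HR image (list of rows).

    The HR grid is shifted by (dy, dx) HR pixels with wrap-around,
    then each q x q block is averaged into one LR pixel.
    """
    rows, cols = len(hr), len(hr[0])
    lr = []
    for r in range(0, rows - q + 1, q):
        lr_row = []
        for c in range(0, cols - q + 1, q):
            block = [hr[(r + i + dy) % rows][(c + j + dx) % cols]
                     for i in range(q) for j in range(q)]
            lr_row.append(sum(block) / (q * q))
        lr.append(lr_row)
    return lr


hr = [[1, 2, 3, 4],
      [5, 6, 7, 8],
      [9, 10, 11, 12],
      [13, 14, 15, 16]]

frame0 = downsample(hr, q=2)          # reference frame
frame1 = downsample(hr, q=2, dx=1)    # half an LR pixel of horizontal motion
print(frame0)
print(frame1)
```

Because `frame1` averages different HR pixels than `frame0`, the two frames constrain the HR image differently, which is what makes multi-frame reconstruction possible when the motion is known to sub-pixel accuracy.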

The model

In Eq. (15), F and f represent the HR and LR images in pixel space. However, from Eq. (10) we know that a face image can also be represented as a vector in morphable model space. We now rewrite the HR and LR images F and f in morphable model space as:
$$F = \Phi a + e_F,\qquad f = \Psi b + e_f$$
where $\Phi$ and $\Psi$ are $q^2S^2 \times 2L$ and $MS^2 \times 2ML$ matrices that contain the base vectors of the HR and LR morphable model spaces in their columns, $a$ is a $2L \times 1$ vector that represents a face in the HR morphable model space, $b$ is a $2ML \times 1$ vector that represents M LR
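A small sketch may help show what representing an image as basis coefficients plus a residual means. It assumes the basis vectors are orthonormal (as they would be if obtained by PCA), so that the coefficients are simple inner products. The source snippet does not say how the coefficients are computed, so this is an assumption, and the names `project` and `reconstruct` are hypothetical.

```python
def project(basis, x):
    """Coefficients of x with respect to an orthonormal basis (list of basis vectors)."""
    return [sum(b_k * x_k for b_k, x_k in zip(b, x)) for b in basis]


def reconstruct(basis, coeffs):
    """Linear combination of the basis vectors with the given coefficients."""
    dim = len(basis[0])
    return [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(dim)]


# Toy orthonormal basis: 2 basis vectors in a 3-dimensional "pixel" space.
phi = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0]]
F = [2.0, -1.0, 0.5]                         # toy image vector

a = project(phi, F)                          # model-space coefficients
F_model = reconstruct(phi, a)                # part of F explained by the basis
e_F = [f - m for f, m in zip(F, F_model)]    # residual outside the model space
print(a, F_model, e_F)
```

Working with the short coefficient vector instead of the full pixel vector is what makes the low-dimensional formulation cheaper; the residual is the part of the image the basis cannot express.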

Constructing morphable model space

In this paper, we used a database of 200 different face images as the training set. Some of them were selected from the ORL database [20] (the selected faces are those without glasses and with nearly neutral expressions), and the others were captured by a webcam with a resolution of 150 million pixels. All the training samples were scaled to a standard resolution of 92 × 92 and used as the original HR images. We then cut and aligned these images using the method described in

Reconstruction results

To provide a quantitative comparison of reconstruction quality, the improved SNR is used, defined as:
$$\Delta\mathrm{SNR}=10\log_{10}\frac{\|a-a_0\|^2}{\|a-\hat{a}\|^2}$$
where $a_0$ is the bilinear interpolation of the reference frame, $a$ is the original HR image, and $\hat{a}$ is the estimated HR image.
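As a minimal sketch of this metric, the function below computes ten times the base-10 logarithm of the ratio between the squared error of the bilinear interpolation and the squared error of the estimate, both measured against the original HR image. Images are treated as flat lists of pixel values; the function names and the toy numbers are hypothetical.

```python
import math


def squared_error(u, v):
    """Squared Euclidean distance between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def improved_snr(original, interpolated, estimated):
    """Improved SNR in dB; positive values mean the estimate beats interpolation."""
    return 10.0 * math.log10(
        squared_error(original, interpolated) / squared_error(original, estimated)
    )


original = [0.0, 0.0]        # toy "HR image"
interpolated = [6.0, 8.0]    # squared error 36 + 64 = 100
estimated = [1.0, 0.0]       # squared error 1

delta_snr = improved_snr(original, interpolated, estimated)
print(delta_snr)
```

A value of 0 dB means the estimate is no better than bilinear interpolation. Note that the ratio is undefined when the estimate matches the original exactly, so a practical implementation would need to guard against a zero denominator.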

We performed SR reconstruction on synthetic LR faces selected from both inside and outside the training set, and the results are shown in Figs. 6 and 7, respectively. In the figures, typical LR images are shown in (a), magnified

Conclusion

In applications of face evaluation and recognition from LR video sequences, SR techniques can be used to obtain an HR face by combining different information from different LR images. Though SR can be applied as a separate preprocessing step in pixel space, in this paper we proposed to apply face SR reconstruction and feature extraction simultaneously in the morphable model space. To do so, LR observations are first projected into the LR morphable model space, and then SR technique is

Acknowledgement

This research is supported by the National Natural Science Foundation of China (Research Grant # 60772117) and the Natural Science Foundation of Guangdong Province, China (Research Grant # 07006491).

References (29)

  • Di Zhang et al. Fast MAP-based multiframe super-resolution image reconstruction. Image Vision Comput. (2005)
  • Carlos Miravet et al. A two-step neural-network based algorithm for fast image super-resolution. Image Vision Comput. (2007)
  • Ce Liu et al. A two-step approach to hallucinating faces: global parametric model and local nonparametric model
  • David Capel et al. Super-resolution from multiple views using learnt image models
  • S. Baker et al. Limits on super-resolution and how to break them. IEEE Trans. Pattern Anal. Mach. Intell. (2002)
  • Ayan Chakrabarti et al. Super-resolution of face images using kernel PCA-based prior. IEEE Trans. Multimedia (Jun 2007)
  • R.Y. Tsai and T.S. Huang. Multiframe image restoration and registration. Advances in Computer Vision and Image...
  • K.S. Ni et al. Image superresolution using support vector regression. IEEE Trans. Image Process. (2007)
  • A.J. Patti et al. Artifact reduction for set theoretic super resolution image reconstruction with edge adaptive constraints and higher-order interpolants. IEEE Trans. Image Process. (2001)
  • Hui Ji et al. Robust wavelet-based super-resolution reconstruction: theory and algorithm. IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • R.R. Schultz et al. Extraction of high-resolution frames from video sequences. IEEE Trans. Image Process. (1996)
  • Di Zhang et al. Fast hybrid approach to large magnification super-resolution image reconstruction. Opt. Eng. (2005)
  • B.K. Gunturk et al. Eigenface-domain super-resolution for face recognition. IEEE Trans. Image Process. (2003)
  • Wan Zhifei et al. Feature-based super-resolution for face recognition

This paper has been recommended for acceptance by Cornelia M Fermuller.