Neurocomputing

Volume 173, Part 3, 15 January 2016, Pages 573-579

Illumination compensation for facial feature point localization in a single 2D face image

https://doi.org/10.1016/j.neucom.2015.07.092

Abstract

Current research has demonstrated that illumination variation in face images degrades the accuracy of facial identity and emotion recognition. To reduce the impact of illumination variation, researchers have proposed many creative illumination compensation methods. However, these methods are limited in compensating for the shadow around the nose. Building on our previous research, we propose a novel approach that effectively reduces the impact of illumination variation, especially the shadow around the nose. First, we preprocess the unevenly lit face image using illuminant direction estimation and an improved Retinex algorithm. Second, we convert the original face image into a binary image consisting of only a shadow region and a non-shadow region, using region growing. Third, we calculate the difference between the intensity of the original input face image and the average intensity of face images under frontal illumination. Fourth, for the face image preprocessed in the first step, we keep its non-shadow region; from the intensity difference, we extract the shadow region and reduce its intensity by an adaptive value. Fifth, we synthesize the non-shadow region and the shadow region obtained in the fourth step. Finally, we apply a maximum filter to smooth the boundary between them. The proposed method is computationally simple and requires neither training steps nor knowledge of 3D models. Experimental results on the extended Yale face database B show that our method achieves better illumination compensation than existing techniques and provides more satisfactory experimental data for facial identity and emotion recognition.
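To make the six steps above concrete, the following is a minimal Python sketch of the pipeline, under stated assumptions: `improved_retinex` and `grow_shadow_mask` are hypothetical callables standing in for the paper's preprocessing and region growing components, the mean frontal-illumination face is assumed precomputed, and the adaptive intensity reduction is simplified to a scalar `gain`. It illustrates the described flow, not the authors' implementation.

```python
# Hedged sketch of the six-step pipeline from the abstract (assumptions noted above).
import numpy as np
from scipy.ndimage import maximum_filter

def compensate_illumination(face, mean_frontal, improved_retinex, grow_shadow_mask, gain=0.5):
    # Step 1: preprocess with illuminant direction estimation + improved Retinex
    # (hypothetical callable standing in for the paper's component).
    pre = improved_retinex(face).astype(np.float64)
    # Step 2: binary shadow mask from region growing (True inside the shadow).
    shadow = grow_shadow_mask(face)
    # Step 3: intensity difference against the mean frontal-illumination face.
    diff = face.astype(np.float64) - mean_frontal
    # Step 4: keep the non-shadow region of the preprocessed image; in the shadow
    # region, rebuild intensity from the difference reduced by an adaptive value
    # (simplified here to the scalar `gain`).
    out = pre.copy()
    out[shadow] = mean_frontal[shadow] + gain * diff[shadow]
    # Steps 5-6: the two regions are already merged in `out`; smooth with a
    # maximum filter (the paper applies it at the region boundary; a global
    # filter is used here for brevity).
    out = maximum_filter(out, size=3)
    return np.clip(out, 0, 255).astype(np.uint8)
```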

Introduction

Facial identity and emotion recognition are widely used in daily life and have extensive application prospects in various industries, such as medical care, health, family services, sports, entertainment and education. However, many uncontrollable factors still restrict the development of facial identity and emotion recognition; the most representative of these is uneven illumination. To address the problems caused by uneven illumination, researchers have carried out relevant work [1], [2], [3], [4]. Vu and Caplier [1] proposed a new illumination normalization approach based on retina modeling, combining two adaptive nonlinear functions and a difference-of-Gaussians filter. Zhao and Wang [2] applied the Discrete Cosine Transform (DCT) to the original face images in the logarithm domain, and then used a local normalization method on the inverse-transformed images to minimize illumination variations in small areas. To achieve illumination-invariant eye detection, Jung et al. [3] employed illumination normalization based on Retinex theory. Liu et al. [4] used the Sobel operator to estimate gradients in face images and then utilized these gradients as weights in an averaging operation; to accelerate the operation and remove certain unwanted features, they calculated integral images from the weighted images. Although all of the above methods can partly eliminate uneven illumination in face images and provide barely satisfactory experimental data for subsequent recognition tasks, they share the same deficiency: they do not perform well on the shadow around the nose. Fig. 1 shows some representative examples from references [1], [2], [3], [4]. The shadow around the nose, which has a negative effect on further image analysis, remains obvious. This is because the assumption that illumination changes slowly and regularly across a face image does not hold for the sudden change of illumination from the non-shadow region to the shadow region. How to eliminate occlusive shadows in face images is still an unsolved problem, especially for a single 2D face image. Thus, building on our previous research [5], we propose a novel method that effectively handles the problems caused by uneven illumination, especially the shadow around the nose.
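As an illustration of the Retinex-style normalization cited in [1], [3], the following is a minimal single-scale Retinex sketch for an 8-bit grayscale image. It is a generic textbook variant assumed here for illustration; it is neither the retina model of [1] nor the exact normalization used in [3].

```python
# Minimal single-scale Retinex: reflectance = log(image) - log(estimated illumination).
import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(image, sigma=30.0):
    img = image.astype(np.float64) + 1.0        # avoid log(0)
    illumination = gaussian_filter(img, sigma)  # smooth, slowly varying lighting estimate
    reflectance = np.log(img) - np.log(illumination)
    # Rescale to 0-255 for display and comparison.
    reflectance -= reflectance.min()
    reflectance /= max(reflectance.max(), 1e-12)
    return (255.0 * reflectance).astype(np.uint8)
```

Note that this kind of normalization fails exactly where the introduction says it does: the Gaussian smoothing encodes the slowly-varying-illumination assumption, which is violated by the abrupt transition at the shadow boundary around the nose.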

The rest of our paper is organized as follows. In Section 2, the framework and details of our proposed illumination compensation method for a single 2D face image are given. Section 3 consists of experimental results and analysis. Finally, conclusions are offered in Section 4.

Section snippets

Proposed method

To help readers better understand our method, we present the system architecture of the proposed illumination compensation method in Fig. 2. First, the input face image is preprocessed to reduce the effect of uneven illumination, using illuminant direction estimation and an improved Retinex algorithm; this part comes from our previous studies and is not described in detail in this paper. Second, a region growing technique is used to divide the face image into a shadow region and a non-shadow region.
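The snippet below is a hedged sketch of such a region growing segmentation: starting from a dark seed pixel, the region expands to 4-connected neighbors whose intensity stays within a tolerance of the seed. The seed choice (darkest pixel) and the tolerance are illustrative assumptions; the paper's exact growing criterion is not reproduced here.

```python
# Hedged region-growing sketch producing a binary shadow mask (assumptions noted above).
import numpy as np
from collections import deque

def grow_shadow_mask(gray, tol=12):
    h, w = gray.shape
    seed = np.unravel_index(np.argmin(gray), gray.shape)  # darkest pixel as seed (assumed)
    base = float(gray[seed])
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # 4-connected neighbors
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx] \
                    and abs(float(gray[ny, nx]) - base) <= tol:
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask  # True inside the grown shadow region
```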

Experimental results and analysis

In this section, we test and verify the proposed method through several experiments on face images from the extended Yale face database B [9], [10], which is commonly used to evaluate the performance of illumination compensation algorithms. Under the same experimental conditions, we compare the proposed method with Histogram Equalization (HE) [11], Local Normalization Technology (LNT) [12], the Local Mean Map (LMM) of an image representing its low-frequency content [13], and the Local Variance Map (LVM) carrying …
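Of these baselines, HE [11] is the simplest to reproduce. The following is a standard NumPy implementation for an 8-bit grayscale image, given here so the comparison is concrete; it is the textbook algorithm, not the specific code used in the experiments.

```python
# Standard histogram equalization for an 8-bit grayscale image.
import numpy as np

def histogram_equalization(gray):
    hist = np.bincount(gray.ravel(), minlength=256)  # intensity histogram
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                        # first nonzero CDF value
    # Map each intensity through the normalized cumulative distribution.
    lut = np.round(255.0 * (cdf - cdf_min) / max(cdf[-1] - cdf_min, 1)).astype(np.uint8)
    return lut[gray]
```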

Conclusions

In this paper, a new method for illumination compensation of face images has been proposed. Based on the illuminant direction estimated in our previous work [5], we first mitigate the negative effect of uneven illumination through an improved Retinex algorithm. Then, we synthesize the obtained non-shadow region and shadow region into a new face image; in this step, we use several helpful techniques such as region growing, the average face, and gain compensation. Finally, …
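The final smoothing step mentioned in the abstract can be confined to the seam between the two synthesized regions. The sketch below is one hedged way to do this, reusing the binary shadow mask from the earlier step; the band width and filter size are illustrative choices, not values from the paper.

```python
# Hedged sketch: apply the maximum filter only in a narrow band around the
# shadow/non-shadow boundary of the synthesized face (assumptions noted above).
import numpy as np
from scipy.ndimage import maximum_filter, binary_dilation

def smooth_region_boundary(synth, shadow_mask, band=2, size=3):
    # Pixels within `band` steps of both regions form the boundary band.
    boundary = binary_dilation(shadow_mask, iterations=band) & \
               binary_dilation(~shadow_mask, iterations=band)
    smoothed = maximum_filter(synth, size=size)
    out = synth.copy()
    out[boundary] = smoothed[boundary]  # smooth only the seam, keep the rest intact
    return out
```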

Acknowledgment

This work was supported in part by the Specialized Research Fund for the Doctoral Program of Higher Education (Grant no. 20121102130001), the Innovation Foundation of BUAA for PhD Graduates, and the National Natural Science Foundation of China (Grant no. 61103097). The authors would like to thank the providers of the extended Yale face database B.



Cited by (6)

  • Spectrogram-frame linear network and continuous frame sequence for bird sound classification

    2019, Ecological Informatics
    Citation Excerpt:

    The traditional methods for classifying bird sound are mainly based on machine learning techniques, including the classification method based on template matching (Chen and Maher, 2006) typically represented by dynamic time warping (DTW) algorithm (Tan et al., 2015), and feature-based classification methods mainly include hidden Markov model (Lee et al., 2012), Gaussian mixture model (Kalan et al., 2015), Naive bayes (Vilches et al., 2006), random forest (Leng and Dat, 2014; Neal et al., 2011; Stowell and Plumbley, 2014) and support vector machine (Zhao et al., 2017). Deep learning has shown excellent performance in the fields of computer vision (Yi et al., 2016), speech recognition, and text processing, etc. The research by Xie and Zhu (2019) showed that the deep learning technique has proved to outperform the traditional methods in terms of accuracy in bird sound classification.

  • Facial expression recognition of intercepted video sequences based on feature point movement trend and feature block texture variation

    2019, Applied Soft Computing Journal
    Citation Excerpt:

    The reason for this choice is that the locations of these points will change more obviously than others as expression evolves, and these variations have different characteristics which are pretty meaningful for FER. This part is from authors’ previous studies [32,33] and has not been described in detail in this paper. Secondly, Facial Expression Sequence Interception (FESI) is presented by determining two frames whose emotional intensities are minimum and maximum, respectively.

  • Multi-Stage Feature Constraints Learning for Age Estimation

    2020, IEEE Transactions on Information Forensics and Security
  • Facial Expression Sequence Interception Based on Feature Point Movement

    2019, 2019 IEEE 11th International Conference on Advanced Infocomm Technology, ICAIT 2019

Jizheng Yi received the B.S. degree in electronic information science and technology from Shandong Agricultural University, Taian, China, in 2008, the M.S. degree in pattern recognition and intelligent systems from Wuyi University, Jiangmen, China, in 2011, and the Ph.D. degree from the School of Electronic and Information Engineering, Beihang University, Beijing, China, in 2015. He is currently a lecturer at the College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, China. His research interests include machine learning for computer vision, image processing, and pattern recognition. He is currently working on facial expression recognition.

Xia Mao received her M.S. and Ph.D. degrees from Saga University, Japan, in 1993 and 1996, respectively. She is currently a professor at the School of Electronic and Information Engineering, Beihang University, Beijing, China. Her current research interests include affective computing, artificial intelligence, pattern recognition and human-computer interaction. Prof. Mao has published over 140 papers both domestically and overseas, many of which are indexed by SCI, EI, and ISTP. She is leading several projects supported by the National High-tech Research and Development Program (863 Program), the National Natural Science Foundation, and the Beijing Natural Science Foundation.

Lijiang Chen received his B.S. and Ph.D. degrees from the School of Electronic and Information Engineering, Beihang University, Beijing, China, in 2007 and 2012, respectively. He is currently a lecturer at the School of Electronic and Information Engineering, Beihang University. His research interests include speech signal processing, pattern recognition and speech emotion recognition.

Alberto Rovetta was born in Brescia, Italy, in 1940. He received the Ph.D. degree from the Politecnico di Milano, Milan, Italy, in 1964. Since 1980, he has been an Ordinary Professor with the Department of Mechanics, Politecnico di Milano, where he also coordinates the Laboratory of Robotics. He is the author of more than 400 publications. His current research interests include biorobotic applications, environmental safety, and automobile transportation. Prof. Rovetta has been the Chairman of the International Committee for Advanced Technologies (UITA-UNESCO) since 1987 and a member of the B6/2 Committee of the International Telecommunication Union since 1993. He is also a member of numerous other committees.
