Towards Continuous Health Diagnosis from Faces with Deep Learning

Martin, Victor; Séguier, Renaud; Porcheron, Aurélie; Morizot, Frédérique

doi:10.1007/978-3-030-00320-3_15

Victor Martin (ORCID: orcid.org/0000-0002-8619-6897)¹⁷,¹⁸,
Renaud Séguier¹⁷,
Aurélie Porcheron¹⁸ &
Frédérique Morizot¹⁸

Conference paper
First Online: 13 September 2018

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11121)

Abstract

Recent studies show that human perception of health from faces is a good predictor of actual health and of healthy behaviors. We aimed to automate human health perception by training a Convolutional Neural Network on a related task (age estimation) and combining it with a Ridge Regression to rate faces. The related task is needed because, unlike health ratings, large datasets labeled with biological age exist. The results show that our system outperforms average human judgments of health. The system could be used on a daily basis to detect early signs of sickness or a declining state. We are convinced that such a system will encourage wider exploration of holistic, fast, and non-invasive measures to speed up diagnosis.

1 Introduction

Judgments of a person’s health based on facial appearance are a daily occurrence in social interactions. Understanding how we perceive health from a face is important because this judgment drives a wide array of social behaviors. Looking healthy has many positive real-life outcomes, such as preferential treatment in the professional context, in the justice system or in dating interactions [1,2,3,4]. Conversely, looking unhealthy is associated with lower self-esteem [5] and may lead to a risk of social stigmatization and isolation [6]. A better understanding of how health is perceived, and of which facial cues alter this perception, is likely to help reduce the negative social consequences that can follow.

Recent scientific evidence also shows that a healthy facial appearance is a good predictor of healthy behaviors [7] and good health [8,9,10]. Faces with increased oxygenated-blood skin coloration are perceived as healthier, and blood oxygenation level is known to be associated with cardiovascular fitness [10]. People with a healthy diet, such as daily consumption of fruits and vegetables, have a more attractive skin color and are perceived as healthier [7]. Sleep-deprived people appear less healthy than when they are well rested [11]. People also appear able to detect signs of sickness in the faces of potentially contagious people at an early phase after exposure to infectious stimuli [12]. Figure 1 shows two average faces, one of people perceived to be in good health and one of people perceived to be in bad health. Because health perception and age are known to be correlated [13], the health ratings are decorrelated from age.

We aim to develop an automatic system able to imitate human judgments of health. Used over the long term, such a technology could enable fast and non-invasive detection of a person’s declining state.

To the best of our knowledge, this paper introduces the first system that estimates health scores from faces. By contrast, age estimation from faces has been studied extensively [14,15,16,17].

More recently, some researchers have begun to study whether it is possible to estimate less common attributes from the face such as intelligence [18], attractiveness [19,20,21,22] or social relation traits [23].

In view of the current state of the art and our constraints, we use a Convolutional Neural Network trained on biological age, combined with a Ridge Regression, to assess health perception from faces (Sec. 2). We then evaluate the performance of the system on our database and compare it with human performance on the same database (Sec. 3).

2 Health Estimation

Based on the age estimation method of [17], we take the VGG-16 Convolutional Neural Network [24], pre-trained on the ImageNet database to recognize 1,000 object classes, and train it on the Internet Movie Database (IMDb) of celebrities (Fig. 2). We filtered the \(\approx 500K\) images to keep only those with exactly one detected face, a face resolution greater than 120\(\,\times \,\)120 pixels, and a depicted person from 11 to 85 years old. For each picture, we know the date of birth of the celebrity and the date the photo was taken, so we can deduce the biological age of the depicted person.
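The filtering rule and the age computation described above can be sketched as follows. This is a minimal illustration only: the record fields (`num_faces`, `face_width`, and so on) are hypothetical names, and counting age in completed years is an assumption, since the text does not state the convention used.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ImageRecord:
    # Hypothetical metadata fields; the names are illustrative only.
    num_faces: int
    face_width: int
    face_height: int
    date_of_birth: date
    photo_date: date


def age_at_photo(rec: ImageRecord) -> int:
    """Age in completed years on the day the photo was taken (assumed convention)."""
    years = rec.photo_date.year - rec.date_of_birth.year
    # Subtract one year if the birthday had not yet occurred in the photo year.
    before_birthday = (rec.photo_date.month, rec.photo_date.day) < (
        rec.date_of_birth.month, rec.date_of_birth.day)
    return years - int(before_birthday)


def keep(rec: ImageRecord) -> bool:
    """Apply the three filters from the text: one face, size, age range."""
    return (
        rec.num_faces == 1
        and rec.face_width > 120
        and rec.face_height > 120
        and 11 <= age_at_photo(rec) <= 85
    )
```

A record would then be kept with `kept = [r for r in records if keep(r)]`, where `records` is whatever list of metadata the face detector produced.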

In addition, we replace the final Multi Layer Perceptron of the original VGG-16 architecture, which contains a large part of the parameters, with a lighter one made of one layer of 1024 units and an output layer of 120 units (Fig. 4). The objective is to shift the learning effort onto the convolutional layers, because the final Multi Layer Perceptron will be dropped: we want to estimate health, not biological age. Having the fastest training with the lowest error is therefore not the main goal here.
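To give a sense of how much lighter the replacement head is, the short calculation below counts the parameters of both heads. It is a back-of-the-envelope sketch that assumes the standard VGG-16 head (4096, 4096 and 1000 units) and assumes the new head is attached to the flattened \(512 \times 7 \times 7\) output of the last pooling layer; the helper name `dense_params` is ours.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


# Flattened output of the last VGG-16 pooling layer: 512 channels of 7 x 7,
# assuming the standard 224 x 224 input.
FEATURES = 512 * 7 * 7

# Standard VGG-16 head: 4096 -> 4096 -> 1000 (ImageNet classes).
original_head = (dense_params(FEATURES, 4096)
                 + dense_params(4096, 4096)
                 + dense_params(4096, 1000))

# Lighter head described in the text: 1024 hidden units, 120 outputs.
light_head = dense_params(FEATURES, 1024) + dense_params(1024, 120)

print(f"original head: {original_head:,} parameters")
print(f"light head:    {light_head:,} parameters")
print(f"reduction:     {original_head / light_head:.1f}x")
```

Under these assumptions the head shrinks from roughly 123.6 million to roughly 25.8 million parameters, while the convolutional blocks are unchanged.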

The last 3 convolutional blocks and the fully connected layers were then trained on IMDb with Stochastic Gradient Descent, using a learning rate of \(10^{-4}\), for 1000 epochs with 10 steps per epoch and a batch size of 16. The decrease of the Mean Absolute Error (MAE) on the training set and the validation set can be seen in Fig. 3.
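The training schedule implies a simple budget of image presentations, which is useful when reading the epoch numbers quoted later. The sketch below assumes that each step consumes exactly one batch; the function name `images_seen` is ours.

```python
EPOCHS = 1000
STEPS_PER_EPOCH = 10
BATCH_SIZE = 16


def images_seen(epochs: int) -> int:
    """Number of image presentations after `epochs` epochs,
    assuming each step consumes exactly one batch."""
    return epochs * STEPS_PER_EPOCH * BATCH_SIZE


print(images_seen(1))       # presentations per epoch
print(images_seen(EPOCHS))  # presentations over the full run
```

With these settings an epoch is a fixed block of 160 images rather than a full pass over the filtered IMDb set.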

Next, we have to build our health estimation system with only 140 images annotated with health scores (Fig. 5). We compute a representation of each face from the newly trained ConvNet, using only the convolution and pooling blocks, and use a regression to estimate health scores from these representations. The question remains: at which epoch should we take the weights for health estimation? If we take the weights at an early epoch, the network will be underfitted. Conversely, because we do not want to predict biological age, the weights from a late epoch with a low age MAE are not necessarily the best choice either.

We evaluate the suitability of the ConvNet weights at each epoch for health estimation with a simple Linear Regression, evaluated by 40-fold cross-validation. Fig. 6 shows how training on a different but related task can improve performance on our health estimation problem. At epoch 0, learning for biological age has not started yet and the health MAE is relatively high (9.0). In a second stage, learning for biological age greatly decreases the health MAE, from 9.0 to 6.2. Finally, as learning progresses and the model specializes in biological age estimation, the error increases again. The best period from which to take the weights for health estimation is around epoch 60.
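The checkpoint selection procedure can be sketched as below. This is an illustrative outline, not the code used for the experiments: the fold layout (contiguous, unshuffled), the pooling of held-out predictions into a single MAE, and the `fit_predict` callable standing in for the Linear Regression are all assumptions.

```python
from typing import Callable, Dict, List, Sequence


def kfold_indices(n_samples: int, n_folds: int) -> List[List[int]]:
    """Split range(n_samples) into n_folds contiguous test folds whose
    sizes differ by at most one."""
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_val_mae(features: Sequence, scores: Sequence[float],
                  folds: List[List[int]], fit_predict: Callable) -> float:
    """Pool the held-out predictions of all folds and return their MAE.
    fit_predict(train_x, train_y, test_x) wraps any regressor (hypothetical)."""
    preds = [0.0] * len(scores)
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(scores)) if i not in held_out]
        fold_preds = fit_predict([features[i] for i in train_idx],
                                 [scores[i] for i in train_idx],
                                 [features[i] for i in test_idx])
        for i, p in zip(test_idx, fold_preds):
            preds[i] = p
    return mean_absolute_error(scores, preds)


def best_epoch(mae_by_epoch: Dict[int, float]) -> int:
    """Return the checkpoint epoch with the lowest cross-validated MAE."""
    return min(mae_by_epoch, key=mae_by_epoch.get)
```

With 140 images and 40 folds, `kfold_indices(140, 40)` yields 20 folds of 4 images and 20 folds of 3. Running `cross_val_mae` once per saved checkpoint and passing the results to `best_epoch` reproduces the kind of curve and minimum discussed for Fig. 6.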

Now that we have found the ConvNet weights used to compute representations from faces, we test several estimators to assess health scores from these representations. For each estimator, we evaluate a broad range of parameters and report those producing the best performance in Table 1. In the table, the Multi Layer Perceptron is composed of two layers, with n neurons in the first layer and 120 in the output layer.

Table 1. List of tested estimators. The estimator with the lowest Mean Absolute Error is bolded.

As we can see in Table 1, simple estimators such as a Linear Regression or a Linear Regression regularized with a low \(\ell _2\) penalty (Ridge Regression) achieve the best performance given our dataset and the feature extraction method chosen earlier. Simpler estimators perform better than more complex ones such as Random Forests or the Multi Layer Perceptron because the number of samples, \(n=140\), is small relative to the dimensionality of our features, \(d=512 \times 7 \times 7=25088\). The final architecture of our system is described in Fig. 7.
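For reference, the Ridge Regression estimator can be written as follows. The notation is introduced here for illustration only: \(X\) is the \(n \times d\) matrix of face representations, \(y\) the vector of health scores, \(w\) the weight vector and \(\lambda\) the penalty strength; the intercept is omitted for brevity.

```latex
\[
\hat{w} = \arg\min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2
        = (X^\top X + \lambda I_d)^{-1} X^\top y
        = X^\top (X X^\top + \lambda I_n)^{-1} y ,
\qquad \lambda > 0 .
\]
```

With \(n=140\) and \(d=25088\), the last form only requires solving a \(140 \times 140\) system, and the penalty keeps the problem well posed even though \(d \gg n\).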

3 Experiment: System Versus Human Performance

We have 140 images of faces, each rated by 74 judges. For every picture, we asked the judges to evaluate health with a score from 0 to 100, 0 meaning perceived in very bad health and 100 meaning perceived in very good health. For each image, we then took the average of the 74 ratings as a reliable perceived health score. In this database, the health scores obtained are 60% correlated with biological ages.

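Using the system described above, we trained the Ridge Regression with 140-fold cross-validation to assess its performance. With 140 images, this amounts to leave-one-out evaluation: each image is scored by a model trained on the 139 others.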
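The evaluation protocol can be sketched in a few lines. The code below is a self-contained illustration, not the implementation used in the paper: it solves the ridge problem in its dual form with a small pure-Python linear solver so that it runs without dependencies, centres features and scores to handle the intercept, and leaves the penalty `lam` as a free parameter because the text only describes it as low. In practice a numerical library would replace the hand-written solver.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def ridge_predict(train_x: Matrix, train_y: Vector,
                  test_x: Vector, lam: float) -> float:
    """Dual-form ridge prediction; centring plays the role of the intercept."""
    n = len(train_y)
    y_mean = sum(train_y) / n
    x_mean = [sum(col) / n for col in zip(*train_x)]
    xc = [[v - mu for v, mu in zip(row, x_mean)] for row in train_x]
    tc = [v - mu for v, mu in zip(test_x, x_mean)]
    # Gram matrix of the centred training features plus lam on the diagonal.
    gram = [[dot(xc[i], xc[j]) + (lam if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    alpha = solve(gram, [y - y_mean for y in train_y])
    return y_mean + sum(a * dot(row, tc) for a, row in zip(alpha, xc))


def leave_one_out(features: Matrix, scores: Vector, lam: float) -> Vector:
    """Predict each sample with a model trained on all the others."""
    preds = []
    for i in range(len(scores)):
        train_x = features[:i] + features[i + 1:]
        train_y = scores[:i] + scores[i + 1:]
        preds.append(ridge_predict(train_x, train_y, features[i], lam))
    return preds
```

Calling `leave_one_out(features, scores, lam)` on the 140 representations and their mean health ratings would return one held-out estimate per image, which can then be compared with the human ratings.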

As Fig. 8 shows, we achieve good performance on our dataset despite the small amount of data. Using the mean absolute error (MAE), the coefficient of determination \(R^2\) and the Pearson correlation (PC), Table 2 shows that our system estimates health more accurately than an average human working on the same dataset.
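The three metrics have standard definitions, which the following sketch implements with the Python standard library only. The function names are ours, and the code assumes the reference scores are not all identical (otherwise \(R^2\) and PC are undefined).

```python
from math import sqrt
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / sqrt(var_t * var_p)
```

Note that \(R^2\) penalizes systematic offsets between estimates and ratings, whereas the Pearson correlation does not, so the two can differ noticeably for the same set of predictions.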

In addition, among the 74 judges, the judge with the lowest MAE (i.e. the smallest average difference between their ratings and the average ratings) is selected and reported in the table below under the name Best Human.
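The human baselines can be computed along the following lines. This is a sketch under stated assumptions: `ratings[j][i]` is a hypothetical layout holding the score given by judge `j` to image `i`, each judge is compared with the average over all judges (their own rating included), and the "average human" figure is taken to be the mean of the per-judge MAEs, which the text does not spell out.

```python
from typing import List, Tuple


def per_image_mean(ratings: List[List[float]]) -> List[float]:
    """Average rating of each image over all judges."""
    n_judges = len(ratings)
    return [sum(col) / n_judges for col in zip(*ratings)]


def judge_maes(ratings: List[List[float]]) -> List[float]:
    """MAE of each judge against the average ratings."""
    consensus = per_image_mean(ratings)
    return [sum(abs(r - c) for r, c in zip(row, consensus)) / len(consensus)
            for row in ratings]


def best_and_average_human(ratings: List[List[float]]) -> Tuple[int, float, float]:
    """Return (index of best judge, best judge MAE, mean MAE over judges)."""
    maes = judge_maes(ratings)
    best = min(range(len(maes)), key=maes.__getitem__)
    return best, maes[best], sum(maes) / len(maes)
```

Because the best judge is selected after seeing all ratings, that row is an optimistic bound on individual human performance rather than a typical value.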

Table 2. Performance of our health estimation system compared to human performance.

As an additional note, health scores are 60% correlated with biological ages, whereas the health estimates output by our system are 90% correlated with health scores. Since age alone could not account for such a high correlation, we conclude that our system estimates health from faces, and not just biological age.

4 Conclusion

This paper describes how we developed an automatic system able to imitate human judgments of health. We trained a Convolutional Neural Network to estimate biological age, and we used the representations it produces on our small database to train a simpler estimator. We observed very good performance when we compared the system with human judgments of health.

Nevertheless, we identified several areas of improvement.

First, using a Linear Regression to rank the different ConvNet weights (Fig. 6) tends to favor this type of estimator in the next step, where we compare the performance of different estimators (Table 1). We could instead have ranked the weights using several estimators.

Moreover, by using more images annotated with health ratings, we could improve the performance of our system and make it more robust to variations in pose and illumination.

Additional work will be necessary to test its performance on other demographic groups such as other ethnicities and men.

To conclude, we developed the first automatic health estimation system able to reproduce human judgments. Such a system could be used in institutions such as hospitals or retirement homes to automatically predict a potential future sickness from earlier visual signs present in a face. Similarly, it could be used for the remote monitoring of patients, to detect a sudden drop in health perception and prevent behaviors that negatively impact health.

References

1. Efran, M.G.: The effect of physical appearance on the judgment of guilt, interpersonal attraction, and severity of recommended punishment in a simulated jury task. J. Res. Pers. 8(1), 45–54 (1974)
2. Marlowe, C.M., Schneider, S.L., Nelson, C.E.: Gender and attractiveness biases in hiring decisions: Are more experienced managers less biased? J. Appl. Psychol. 81(1), 11–21 (1996)
3. Ritts, V., Patterson, M.L., Tubbs, M.E.: Expectations, impressions, and judgments of physically attractive students: a review. Rev. Educ. Res. 62(4), 413–426 (1992)
4. Spisak, B.R., Blaker, N.M., Lefevre, C.E., Moore, F.R., Krebbers, K.F.B.: A face for all seasons: Searching for context-specific leadership traits and discovering a general preference for perceived health. Front. Hum. Neurosci. 8, 792 (2014)
5. Feingold, A.: Good-looking people are not what we think. Psychol. Bull. 111(2), 304 (1992)
6. Henderson, A.J., Holzleitner, I.J., Talamas, S.N., Perrett, D.I.: Perception of health from facial cues. Philos. Trans. R. Soc. B: Biol. Sci. 371(1693), 20150380 (2016)
7. Whitehead, R.D., Re, D., Xiao, D., Ozakinci, G., Perrett, D.I.: You are what you eat: within-subject increases in fruit and vegetable consumption confer beneficial skin-color changes. PLoS ONE 7(3), e32988 (2012)
8. Zebrowitz, L.A., et al.: Older and younger adults’ accuracy in discerning health and competence in older and younger faces. Psychol. Aging 29(3), 454 (2014)
9. Stephen, I.D., Coetzee, V., Smith, L.M., Perrett, D.I.: Skin blood perfusion and oxygenation colour affect perceived human health. PLoS ONE 4(4), e5083 (2009)
10. Re, D.E., Whitehead, R.D., Xiao, D., Perrett, D.I.: Oxygenated-blood colour change thresholds for perceived facial redness, health, and attractiveness. PLoS ONE 6(3), e17859 (2011)
11. Axelsson, J., Sundelin, T., Ingre, M., Someren, E.J.W.V., Olsson, A., Lekander, M.: Beauty sleep: experimental study on the perceived health and attractiveness of sleep deprived people. BMJ 341, c6614 (2010)
12. Axelsson, J., Sundelin, T., Axelsson, C., Lasselin, J., Lekander, M.: Identification of acutely sick people and facial cues of sickness. Brain Behav. Immun. 66, e38 (2017)
13. Fink, B., Matts, P., D’Emiliano, D., Bunse, L., Weege, B., Röder, S.: Colour homogeneity and visual perception of age, health and attractiveness of male facial skin: perception of male skin colour. J. Eur. Acad. Dermatol. Venereol. 26(12), 1486–1492 (2011)
14. Lanitis, A., Taylor, C.J., Cootes, T.F.: Toward automatic simulation of aging effects on face images. IEEE Trans. Pattern Anal. Mach. Intell. 24(4), 442–455 (2002)
15. Lanitis, A., Draganova, C., Christodoulou, C.: Comparing different classifiers for automatic age estimation. IEEE Trans. Syst., Man, Cybern. Part B Cybern. 34, 621–628 (2004)
16. Guo, G., Mu, G., Fu, Y., Huang, T.S.: Human age estimation using bio-inspired features. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 112–119. IEEE (2009)
17. Rothe, R., Timofte, R., Van Gool, L.: DEX: deep expectation of apparent age from a single image. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 10–15 (2015)
18. Qin, R., Gao, W., Xu, H., Hu, Z.: Modern physiognomy: an investigation on predicting personality traits and intelligence from the human face. arXiv:1604.07499 [cs], April 2016
19. Fan, Y.Y., Liu, S., Li, B., Guo, Z., Samal, A., Wan, J., Li, S.Z.: Label distribution-based facial attractiveness computation by deep residual learning. IEEE Trans. Multimed. PP(99), 1 (2017)
20. Chen, F., Zhang, D.: Combining a causal effect criterion for evaluation of facial attractiveness models. Neurocomputing 177, 98–109 (2016)
21. Liu, S., Fan, Y.Y., Samal, A., Guo, Z.: Advances in computational facial attractiveness methods. Multimed. Tools Appl. 75(23), 16633–16663 (2016)
22. Chen, F., Xiao, X., Zhang, D.: Data-driven facial beauty analysis: prediction, retrieval and manipulation. IEEE Trans. Affect. Comput. PP(99), 1 (2017)
23. Zhang, Z., Luo, P., Loy, C.C., Tang, X.: Learning social relation traits from face images. arXiv:1509.03936 [cs], September 2015
24. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

Author information

Authors and Affiliations

CentraleSupelec, Avenue de la Boulaie, 35510, Cesson-Sévigné, France
Victor Martin & Renaud Séguier
CHANEL Parfums Beauté, 8 Rue du Cheval Blanc, 93500, Pantin, France
Victor Martin, Aurélie Porcheron & Frédérique Morizot

Corresponding author

Correspondence to Victor Martin.

Editor information

Editors and Affiliations

University of Dundee, Dundee, UK
Islem Rekik
Istanbul Technical University, Istanbul, Turkey
Gozde Unal
Stanford University, Stanford, CA, USA
Ehsan Adeli
Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea (Republic of)
Sang Hyun Park

About this paper

Cite this paper

Martin, V., Séguier, R., Porcheron, A., Morizot, F. (2018). Towards Continuous Health Diagnosis from Faces with Deep Learning. In: Rekik, I., Unal, G., Adeli, E., Park, S. (eds) PRedictive Intelligence in MEdicine. PRIME 2018. Lecture Notes in Computer Science, vol 11121. Springer, Cham. https://doi.org/10.1007/978-3-030-00320-3_15

DOI: https://doi.org/10.1007/978-3-030-00320-3_15
Published: 13 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00319-7
Online ISBN: 978-3-030-00320-3
eBook Packages: Computer Science, Computer Science (R0)
