Abstract
This paper describes a novel structural approach to recognizing human facial features for emotion recognition. Conventionally, features extracted from facial images are held in relatively poor representations, such as arrays or sequences, with a static data structure. In this study, we propose to extract facial expression feature vectors as Localized Gabor Features (LGF) and then transform these feature vectors into a FacE Emotion Tree Structures (FEETS) representation, an extension of the Human Face Tree Structures (HFTS) representation presented in (Cho and Wong in Lecture notes in computer science, pp 1245–1254, 2005). This facial representation simulates the way a human perceives a real face, and both the entities and their relationships can contribute to the facial expression features. Moreover, a new structural connectionist architecture based on a probabilistic approach to adaptive processing of data structures is presented. The so-called probabilistic based recursive neural network (PRNN) model, extended from Frasconi et al. (IEEE Trans Neural Netw 9:768–785, 1998), is developed to train on and recognize human emotions by generalizing the FEETS representation. For empirical studies, we benchmarked our emotion recognition approach against other well-known classifiers. Using public domain databases, such as the Japanese Female Facial Expression (JAFFE) database (Lyons et al. in IEEE Trans Pattern Anal Mach Intell 21(12):1357–1362, 1999; Lyons et al. in third IEEE international conference on automatic face and gesture recognition, 1998) and the Cohn–Kanade AU-Coded Facial Expression (CMU) database (Cohn et al. in 7th European conference on facial expression measurement and meaning, 1997), our proposed system obtains an accuracy of about 85–95% under subject-dependent and subject-independent conditions.
Moreover, when tested on images containing artifacts, the proposed model shows a robust capability for facial emotion recognition.
References
Ekman P (2004) Emotions revealed, 1st Owl Books edn. Henry Holt and Company LLC, New York
Isen AM (2000) Positive affect and decision making. In: Handbook of emotions. Guilford Press, New York, pp 417–435
Baron-Cohen S (1995) Mindblindness: an essay on autism and theory of mind. MIT Press, Cambridge
Perlovsky LI (2001) Neural networks and intellect: using model-based concepts. Oxford University Press, New York (3rd printing)
Wierzbicka A (1999) Emotions across languages and cultures: diversity and universals. Cambridge University Press, Paris
Ekman P (1999) Facial expressions. In: Dalgleish T, Power M (eds) Handbook of cognition and emotion. Wiley, New York
Thompson J (1941) Development of facial expression of emotion in blind and seeing children. Arch Psychol 37:1–47
Tian Y-L, Kanade T, Cohn JF (2001) Recognizing action units for facial expression analysis. IEEE Trans Pattern Anal Mach Intell 23(2):97–115
Ekman P, Friesen WV (1978) Facial action coding system: a technique for the measurement of facial movement. Consulting Psychologists Press, Palo Alto
Donato G et al (1999) Classifying facial actions. IEEE Trans Pattern Anal Mach Intell 21(10):974–989. doi:10.1109/34.799905
SNHC MPEG4 (1996) Face and body definition and animation parameters. In: ISO/IEC JTC1/SC29/WG11 MPEG96/N1365
Mase K (1991) Recognition of facial expression from optical flow. IEICE Trans E 74(10):3474–3483
Rosenblum M, Yacoob Y, Davis L (1996) Human expression recognition from motion using a radial basis function network architecture. IEEE Trans Neural Netw 7(5):1121–1138. doi:10.1109/72.536309
Lanitis A, Taylor C, Cootes T (1997) Automatic interpretation and coding of face images using flexible models. IEEE Trans Pattern Anal Mach Intell 19(7):743–756. doi:10.1109/34.598231
Belhumeur PN, Hespanha JP, Kriegman DJ (1997) Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720. doi:10.1109/34.598228
Penev PS, Atick JJ (1996) Local feature analysis: a general statistical theory for object representation. Network Comput Neural Syst 7(3):477–500. doi:10.1088/0954-898X/7/3/002
Bartlett MS, Sejnowski T (1997) Viewpoint invariant face recognition using independent component analysis and attractor networks. In: Mozer M, Jordan M, Petsche T (eds) Advances in neural information processing systems. MIT Press, Cambridge
Daugman JG (1988) Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression. IEEE Trans Acoust Speech Signal Process 36(7):1169–1179
Frasconi P, Gori M, Sperduti A (1998) A general framework for adaptive processing of data structures. IEEE Trans Neural Netw 9:768–785. doi:10.1109/72.712151
Tsoi AC (1998) Adaptive processing of data structure: an expository overview and comments. Faculty of Informatics, University of Wollongong, Wollongong, Australia
Sperduti A, Starita A (1997) Supervised neural networks for classification of structures. IEEE Trans Neural Netw 8:714–735. doi:10.1109/72.572108
Cho SY, Chi Z, Siu WC, Tsoi AC (2003) An improved algorithm for learning long-term dependency problems in adaptive processing of data structures. IEEE Trans Neural Netw 14(4):781–793. doi:10.1109/TNN.2003.813831
Cho SY (2008) Probabilistic based recursive model for adaptive processing of data structures. Expert Syst Appl 32(2):1403–1422
Cho S-Y, Wong J-J (2005) Probabilistic based recursive model for face recognition. In: Wang L, Jin Y (eds) Lecture notes in computer science. Springer GmbH, New York, pp 1245–1254
Platt J (1998) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges C, Smola A (eds) Advances in kernel methods—support vector learning. MIT Press, Cambridge, pp 185–208
Aha D, Kibler D (1991) Instance based learning algorithms. Mach Learn 6:37–66
John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: The eleventh conference on uncertainty in artificial intelligence. Morgan Kaufmann, San Mateo
Lyons MJ, Budynek J, Akamatsu S (1999) Automatic classification of single facial images. IEEE Trans Pattern Anal Mach Intell 21(12):1357–1362. doi:10.1109/34.817413
Lyons MJ et al (1998) Coding facial expressions with Gabor wavelets. In: Third IEEE international conference on automatic face and gesture recognition. Nara Japan, IEEE Computer Society
Cohn JF et al (1997) Automated face coding: a computer-vision based method of facial expression analysis. In: 7th European conference on facial expression measurement and meaning
Kelly MD (1970) Visual identification of people by computer. Technical report AI-130, Stanford AI Proj., Stanford, CA
Yang J, Waibel A (1996) A real-time face tracker. In: IEEE workshop on applications of computer vision. Sarasota, FL, USA
Aras S, Subramanian AK, Zhang Z (2004) Face recognition. In: CSE717 lecture notes. University of Buffalo, New York
Sirovich L, Kirby M (1987) Low-dimensional procedure for the characterization of human face. J Opt Soc Am 4:519–524
Kirby M, Sirovich L (1990) Application of the Karhunen–Loeve procedure for characterization of human faces. IEEE Trans Pattern Anal Mach Intell 12:103–108
Liu C, Wechsler H (2003) Independent component analysis of Gabor features for face recognition. IEEE Trans Neural Netw 14(4):919–928. doi:10.1109/TNN.2003.813829
Daugman JG (1985) Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional cortical filters. J Opt Soc Am 2(7):1160–1167
Jones J, Palmer L (1987) An evaluation of the two dimensional Gabor Filter model of simple receptive fields in cat striate cortex. J Neurophysiol 58(6):1233–1258
Marcelja S (1980) Mathematical description of the responses of simple cortical cells. J Opt Soc Am 70:1297–1300
Brunelli R, Poggio T (1993) Face recognition: features versus templates. IEEE Trans Pattern Anal Mach Intell 15:1042–1052. doi:10.1109/34.254061
Weyrauch B, Huang J (2003) Component-based face recognition with 3D morphable models. In: 4th Conference on audio- and video-based biometric person authentication
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Ekman P, Rosenberg EL (1997) What the face reveals: basic and applied studies of spontaneous expression using the facial action coding system (FACS). Oxford University Press, New York
Adolphs R, Damasio H, Tranel D, Damasio AR (1996) Cortical systems for the recognition of emotion in facial expression. J Neurosci 16(23):7678–7697
Manjunath BS, Ma WY (1996) Texture features for browsing and retrieval of image data. IEEE Trans Pattern Anal Mach Intell 18(8):837–842. doi:10.1109/34.531803
Viola P, Jones MJ (2001) Robust real-time object detection. In: Second international workshop on statistical and computational theories of vision—modeling, learning, computing, and sampling. Vancouver, Canada
Streit RL, Luginbuhl TE (1994) Maximum likelihood training of probabilistic neural networks. IEEE Trans Neural Netw 5(5):764–783. doi:10.1109/72.317728
Roberts S, Tarassenko L (1994) A probabilistic resource allocating network for novelty detection. Neural Comput 6:270–284. doi:10.1162/neco.1994.6.2.270
Mak MW, Kung SY (2000) Estimation of elliptical basis function parameters by the EM algorithms with application to speaker verification. IEEE Trans Neural Netw 11(4):961–969. doi:10.1109/72.857775
Lin SH, Kung SY, Lin LJ (1997) Face recognition/detection by probabilistic decision-based neural network. IEEE Trans Neural Netw 8(1):114–132 (special issue on biometric identification)
Hammer B et al (2004) A general framework for unsupervised processing of structured data. Neurocomputing 57:3–35. doi:10.1016/j.neucom.2004.01.008
Bengio Y, Simard P, Frasconi P (1994) Learning long term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166. doi:10.1109/72.279181
Bengio Y, Frasconi P (1996) Input-Output HMMs for sequence processing. IEEE Trans Neural Netw 7:1231–1249. doi:10.1109/72.536317
Kung SY, Taur JS (1995) Decision-based neural networks with signal/image classification applications. IEEE Trans Neural Netw 6:170–181. doi:10.1109/72.363439
Cho S-Y, Chow TWS (1999) Training multilayer neural networks using fast global learning algorithm—least squares and penalized optimization methods. Neurocomputing 25(1–3):115–131. doi:10.1016/S0925-2312(99)00055-7
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Roy Statist Soc Ser B Methodological 39(1):1–38
Abboud B, Davoine F (2004) Appearance factorization based facial expression recognition and synthesis. In: Proceedings of 17th international conference on pattern recognition, vol 4, pp 163–166
Ma L, Khorasani K (2004) Facial expression recognition using constructive feedforward neural networks. IEEE Trans Syst Man Cybern B 34(3):1588–1595
Guo G, Dyer C (2005) Learning from examples in the small case: face expression recognition systems. IEEE Trans Syst Man Cybern B 35(3):477–488
Zheng W, Zhou X, Zou C, Zhao C (2006) Facial expression recognition using kernel canonical correlation analysis (KCCA). IEEE Trans Neural Netw 17(1):233–238. doi:10.1109/TNN.2005.860849
Wu Y, Liu H, Zha H (2005) Modeling facial expression space for recognition. In: Proceedings of international conference on intelligent robots and systems, pp 1968–1973
Appendix A
A.1. Proof of Condition 1 for PRNN convergence
Taking the expectation in (22), and since \( E\{ {\mathbf{a}}_{k} (n)\Upphi_{j} \} = 0, \) we get
where we can define:
and
We refer to \( {\mathbf{R}}_{\Upphi } \) as the correlation matrix of the input pattern set and to \( {\mathbf{r}}_{d\Upphi } \) as the cross-correlation between the input pattern and the desired output. Meanwhile, if we assume that \( \Upgamma = E\left\{ {\Uplambda \left( {\Upphi_{j}^{1} } \right)\Upphi_{j}^{1} } \right\}, \) as applies in a deep tree structure, we can define a new matrix \( {\mathbf{R}}_{\Upphi }^{*} = {\mathbf{R}}_{\Upphi } \Upgamma \) and a new vector \( {\mathbf{r}}_{d\Upphi }^{*} = {\mathbf{r}}_{d\Upphi } \Upgamma \). To find the condition for convergence in the mean, we can apply an orthogonal similarity transformation to the matrix \( {\mathbf{R}}_{\Upphi }^{*} \):
where Λ is a diagonal matrix made up of the eigenvalues of the matrix \( {\mathbf{R}}_{\Upphi }^{*} \) and P is an orthogonal matrix whose columns are the associated eigenvectors of \( {\mathbf{R}}_{\Upphi }^{*} \). By Wiener–Hopf filtering, we can define:
where \( {\mathbf{a}}_{o} \) is the optimum solution. Using the above equation for \( {\mathbf{r}}_{d\Upphi }^{*} \) and substituting the orthogonal similarity transformation of (33) into (28), we get
Let w(n) be defined as a transformed version of the deviation between \( E\left\{ {{\mathbf{a}}_{k} (n)} \right\} \) and \( {\mathbf{a}}_{o} \). We then have the affine transformation:
and
So, from (36), we simply have
The above equation represents a system of uncoupled homogeneous first-order difference equations:
where the \( \lambda_{j} \) are the eigenvalues of the matrix \( {\mathbf{R}}_{\Upphi }^{*} \) and \( w_{j}(n) \) is the jth element of the vector w(n). For the algorithm to be convergent in the mean, we require that, for an arbitrary choice of the initial value of \( w_{j}(n), \) the following condition be satisfied:
Under this condition, \( w_{j}(n) \to 0 \) as \( n \to \infty, \) so we can select the learning parameter η as follows:
where \( \lambda_{\max } \) is the largest eigenvalue of the matrix \( {\mathbf{R}}_{\Upphi }^{*} \).
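The bound on the learning parameter can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation. It assumes the uncoupled recurrence takes the standard form \( w_{j}(n+1) = (1 - \eta \lambda_{j}) w_{j}(n), \) so that convergence in the mean requires \( 0 < \eta < 2/\lambda_{\max }. \) It also uses a hypothetical symmetric positive-definite matrix `R` as a stand-in for \( {\mathbf{R}}_{\Upphi }^{*} \). The function name `mean_deviation_norms` and all data are illustrative.

```python
import numpy as np


def mean_deviation_norms(R, eta, w0, steps=200):
    """Iterate w(n+1) = (I - eta * Lambda) w(n) in the eigenbasis of R.

    Assumes R is symmetric, so that R = P diag(lam) P^T with P orthogonal.
    Returns the eigenvalues and the norm of w(n) at every step.
    """
    lam, P = np.linalg.eigh(R)       # eigenvalues and orthonormal eigenvectors
    w = P.T @ w0                     # rotate the initial deviation into the eigenbasis
    norms = [np.linalg.norm(w)]
    for _ in range(steps):
        w = (1.0 - eta * lam) * w    # one uncoupled first-order update per mode
        norms.append(np.linalg.norm(w))
    return lam, np.array(norms)


# Hypothetical stand-in for the correlation-type matrix (assumption, not paper data).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 4))
R = A.T @ A / 50.0                   # symmetric positive definite
w0 = np.ones(4)                      # arbitrary initial deviation

lam, _ = mean_deviation_norms(R, 0.0, w0, steps=1)
eta_max = 2.0 / lam.max()            # assumed upper bound on the learning rate

_, norms_inside = mean_deviation_norms(R, 0.9 * eta_max, w0)
_, norms_outside = mean_deviation_norms(R, 1.1 * eta_max, w0)

print("inside bound, deviation shrinks:", norms_inside[-1] < norms_inside[0])
print("outside bound, deviation grows:", norms_outside[-1] > norms_outside[0])
```

With a learning rate just inside the assumed bound, every mode is multiplied by a factor of magnitude below one, so the transformed deviation shrinks. Just outside the bound, the mode tied to the largest eigenvalue is multiplied by a factor of magnitude above one and grows geometrically.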
A.2. Proof of Condition 2 for PRNN convergence
The cost function defined in (21) can be expanded by a first-order Taylor series as:
where \( \Updelta {\mathbf{a}}_{k} = \eta \left[ { - \frac{\partial J}{{\partial {\mathbf{a}}_{k} }}} \right]. \) By the convergence theorem of the second distinct, ΔJ ≤ 0 is given, so we define:
Substituting (23) and (26) into (43), we get,
and, since η is assumed to be positive, we have:
Taking the logarithm of both sides to eliminate the exponential, we have
Initially, we assume that |β| ≫ 0, i.e., β is quite large, so that \( \beta^{2} \to \infty \) and \( \frac{1}{{\beta^{2} }} \to 0; \) therefore
So, we can choose the penalty factor β as follows:
Assuming a deep tree structure, Eq. (26) can be substituted into (49) to give:
Cite this article
Wong, JJ., Cho, SY. A face emotion tree structure representation with probabilistic recursive neural network modeling. Neural Comput & Applic 19, 33–54 (2010). https://doi.org/10.1007/s00521-008-0225-z
DOI: https://doi.org/10.1007/s00521-008-0225-z