1 Introduction

Changing one's facial expression is one of the major means of communicating feelings and thoughts to others. Understanding and recognizing facial expressions has therefore long been an important subject of both scientific and engineering research, with many important applications in communication and medicine.

Among the theoretical models supporting current engineering approaches to facial expression recognition, the categorical theory of Ekman claims the universal existence of basic facial expressions (Anger, Disgust, Fear, Surprise, Happiness, Sadness), independent of culture, race, and individual experience [1, 2]. These expressions, with the Neutral added to make 7, are still used for labeling expression images in many facial expression databases. This discrete theory certainly provides a convenient basis for classification and symbolic processing, as well as a common ground for comparing and mutually understanding facial expressions and emotions among individuals at the linguistic level. On the other hand, it has difficulty dealing with ambiguous and subtle expressions. Besides, it leaves no room for individual variations.

As to the first problem, the dimensional theory, which describes expressions with a small number of dimensions, provides a partial answer. This model represents expressions in a 2D or 3D space with continuous coordinates and seeks inter-relationships among the basic expressions through their positions in the space, as in the circular model proposed by Russell et al. [3]. One of the problems with these approaches is that the psychological expression space has no direct correspondence with the physical stimuli, i.e. the facial images.

Another problem, which seems to remain open and is related to the difficulty of describing individual characteristics in expression perception, is that it is hard to find an objective representation of facial expressions that allows perceptions to be compared between different individuals.

Recently, Sumiya et al. reported the construction of a psychophysical space of facial expressions in which a direct correspondence can be found between psychological responses and physical stimuli [7]. The new facial expression space was obtained by measuring JND (just noticeable difference) thresholds of facial expressions in the image space or its PCA subspaces. It was also revealed that the facial expression space is not a Euclidean space but a Riemann space, whose Riemann metric tensor is defined by the JND thresholds.

In this research, we model personal characteristics in expression perception using the geometry of the expression space as a Riemann space, and propose a method to compare facial expression perceptions between different individuals and, furthermore, to exchange or share their impressions with each other. In particular, we show algorithms to build an isometric or distance-preserving mapping which transforms the JND threshold ellipsoids of one individual into those of another.

2 Space of Facial Expressions

The first representation of facial expressions in a spatial form was the dimensional theory, which places the expressions in a space of small dimension obtained by MDS or PCA, called a facial expression space. The well-known circular model of expressions by Russell et al. [3] showed that the basic facial expressions have a roughly circular distribution in a two-dimensional space with the axes "Pleasure-Displeasure" and "Arousal-Sleep".

These facial expression spaces of the dimensional theory are actually psychological spaces, since they are obtained from evaluation scores of psychological experiments such as SD tests and the Affect Grid; the high-dimensional data are then mapped to low-dimensional subspaces by dimension reduction using MDS or PCA. Hence it is difficult to find a direct correspondence with physical stimuli. In fact, the coordinates of points in the space become continuous only after the projections by MDS or PCA; before that, they are discrete levels of relative, rather than absolute, evaluations. Therefore, such spatial coordinates seem not to be an ideal option as a continuous and quantitative representation of facial expressions.

Recently, a facial expression space in which one can track the correspondence between physical stimuli and psychological responses was proposed by Sumiya et al. [7]. This psychophysical space is obtained by measuring facial expression JND thresholds in the image space or its PCA subspaces. In addition, it was discovered that the facial expression space is not a Euclidean space but a curved space, i.e. a Riemannian manifold.

To be specific, recall that a Riemannian space S is a space on which a Riemann metric tensor \(G(\varvec{x})=(g_{ij})\) is smoothly defined at each point \(\varvec{x}\). The inner product between two infinitesimal vectors \(\varvec{a}, \varvec{b}\) centered at the point \(\varvec{x}\), or two tangent vectors in the tangent space \(T_{\varvec{x}} S\) at \(\varvec{x}\), is defined as

$$ (\varvec{a}, \varvec{b}):= \varvec{a}^TG(\varvec{x})\varvec{b}=\sum _{ij}g_{ij}a^ib^j, \quad \forall \varvec{a}=(a^i), \varvec{b}=(b^j)\in T_{\varvec{x}}S $$

Accordingly, the local distance from the point \(\varvec{x}\) to \(\varvec{x}+\varvec{a}\) is given by \(\Vert \varvec{a}\Vert \), and the "unit circle" centered at \(\varvec{x}\) is the set of tangent vectors \(\varvec{a}=(a^i)\in T_{\varvec{x}}S\) satisfying

$$ \Vert \varvec{a}\Vert ^2=(\varvec{a}, \varvec{a}) = \varvec{a}^TG(\varvec{x})\varvec{a}=\sum _{ij}g_{ij}a^ia^j=1. $$

The matrix \(G(\varvec{x})\) is called the Riemann metric tensor which determines geometry of the space [10].
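For example, if at some point \(G(\varvec{x})=\mathrm{diag}(1/a^2,\,1/b^2)\), then a tangent vector \(\varvec{a}=(u, v)\) has unit length exactly when \(u^2/a^2 + v^2/b^2 = 1\); the "unit circle" is then an axis-aligned ellipse with semi-axes a and b, which is precisely the form taken by the JND ellipses discussed below.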

In fact, the Riemann metric tensor of the facial expression space in [7] is defined by the JND thresholds, which serve as subjective unit circles in the image space or its PCA subspaces. The measurements of JND thresholds showed a stable common pattern among different observers, while individual variations were also observed. This approach therefore also suggests the possibility of modeling personal characteristics in facial expression perception through the Riemann metric tensor of each observer's facial expression space.

3 Compare and Exchange Facial Expression Perceptions

Among the numerous studies of facial expressions, few seem to focus on individual differences in facial expression recognition or on cross-examining perceptions between different observers.

Here, we propose to use the Riemann metric tensor in the facial expression space as a characteristic of each individual's perception, and show algorithms for comparing and exchanging or sharing subjective facial expressions between different individuals. In particular, we consider hereafter two observers named Alice and Bob.

The proposed method consists of the following five steps; Sects. 3.1, 3.2 and 3.3 are based on procedures by Sumiya et al. [4].

3.1 Build PCA Subspaces of the Image Space and Morphing

First we prepare the image space of facial expressions and the morphing sequences required in the experiments, which are then projected to a low-dimensional subspace by dimension reduction. In this paper, we apply principal component analysis to obtain these PCA subspaces. Figure 1 shows an example of a two-dimensional PCA subspace of the facial expression image space, together with the morphing sequences between the Neutral and the basic expressions.

Fig. 1. Image subspace obtained by PCA and morphing sequences
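As a minimal sketch of this step, assuming the morphing frames are available as equally sized grayscale images (all function and variable names below are ours, not from the original implementation):

```python
import numpy as np

def build_pca_subspace(frames, n_components=2):
    """frames: (N, H, W) array of morphing frames.
    Returns the mean face, the PCA basis, and the coordinates
    of each frame in the low-dimensional subspace."""
    X = frames.reshape(len(frames), -1).astype(float)  # flatten each image
    mean_face = X.mean(axis=0)
    Xc = X - mean_face
    # SVD of the centered data gives the principal axes
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:n_components]          # (n_components, H*W)
    coords = Xc @ basis.T              # each frame as a point in the subspace
    return mean_face, basis, coords
```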

3.2 Measure JND Thresholds for Alice and Bob

The next step is to measure the facial expression JND thresholds of Alice and Bob. A continuous sequence of images from one expression to another is shown to the observer, and the point at which the facial expression is judged to have changed is recorded as the expression JND threshold.
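Schematically, and only as an illustration (the actual psychophysical procedure, e.g. the adjustment method of Sect. 4.1, is more involved), the recording can be organized as follows, with the observer's response stubbed out as a callback:

```python
def measure_jnd(coords, judged_changed):
    """coords: (N, 2) subspace coordinates of the morphing frames,
    ordered from the starting expression towards the target.
    judged_changed(i) -> bool is the observer's response at frame i.
    Returns the first point judged as a changed expression."""
    for i in range(1, len(coords)):
        if judged_changed(i):
            return coords[i]          # the JND threshold point
    return coords[-1]                 # no change reported in this sequence
```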

3.3 Estimate the JND Ellipsoids

We then calculate the equations of the JND threshold ellipses of Alice and Bob obtained in the previous experiments. We use the Fundamental Numerical Scheme (FNS) [5] to fit an ellipse to the points in the PCA subspace, with equation

$$\begin{aligned} Ax^2+2Bxy+Cy^2 = 1. \end{aligned}$$

Define the variable vector as

$$\begin{aligned} \theta \equiv (A, B, C, D)^T, \qquad D = -1 \end{aligned}$$

and the data vector as

$$\begin{aligned} \xi _{\alpha } \equiv (x_{\alpha }^2,\ 2x_{\alpha }y_{\alpha },\ y_{\alpha }^2,\ 1)^T. \end{aligned}$$

Then the covariance matrix of the data vectors is defined as

$$\begin{aligned} V[\xi _{\alpha }] = \begin{pmatrix} \overline{x}^2_{\alpha } & \overline{x}_{\alpha }\overline{y}_{\alpha } & 0 & 0\\ \overline{x}_{\alpha }\overline{y}_{\alpha } & \overline{x}^2_{\alpha } + \overline{y}^2_{\alpha } & \overline{x}_{\alpha }\overline{y}_{\alpha } & 0\\ 0 & \overline{x}_{\alpha }\overline{y}_{\alpha } & \overline{y}^2_{\alpha } & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}. \end{aligned}$$

Here the Sampson error J is defined as

$$\begin{aligned} J = \frac{1}{N} \sum _{\alpha = 1}^{N} \frac{(\xi _{\alpha }, \theta )^2}{(\theta , V[\xi _{\alpha }]\theta )}. \end{aligned}$$

The value of \(\theta \) which minimizes the Sampson error then provides the estimates of A, B, C.
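For concreteness, here is a compact NumPy sketch of the FNS iteration under the formulation above; the initialization, sign handling, and stopping rule are our assumptions, and [5] should be consulted for the precise scheme:

```python
import numpy as np

def fit_ellipse_fns(points, n_iter=100, tol=1e-10):
    """Fit A x^2 + 2B xy + C y^2 = 1 to 2D points by FNS (a sketch)."""
    x, y = points[:, 0], points[:, 1]
    N = len(points)
    # Data vectors xi_alpha = (x^2, 2xy, y^2, 1)^T
    xi = np.stack([x**2, 2*x*y, y**2, np.ones(N)], axis=1)
    # Covariance matrices V[xi_alpha] (up to a common scale factor)
    V = np.zeros((N, 4, 4))
    V[:, 0, 0] = x**2
    V[:, 0, 1] = V[:, 1, 0] = x*y
    V[:, 1, 1] = x**2 + y**2
    V[:, 1, 2] = V[:, 2, 1] = x*y
    V[:, 2, 2] = y**2
    theta = np.array([1.0, 0.0, 1.0, -1.0])   # start from a unit circle
    theta /= np.linalg.norm(theta)
    for _ in range(n_iter):
        w = 1.0 / np.einsum('i,nij,j->n', theta, V, theta)  # 1/(theta, V theta)
        r = xi @ theta                                      # residuals (xi, theta)
        M = np.einsum('n,ni,nj->ij', w, xi, xi) / N
        L = np.einsum('n,nij->ij', (w * r)**2, V) / N
        vals, vecs = np.linalg.eigh(M - L)
        theta_new = vecs[:, np.argmin(np.abs(vals))]        # eigenvalue nearest 0
        if theta_new @ theta < 0:
            theta_new = -theta_new                          # fix sign flips
        if np.linalg.norm(theta_new - theta) < tol:
            theta = theta_new
            break
        theta = theta_new
    A, B, C, D = theta / (-theta[3])   # rescale so that D = -1
    return A, B, C
```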

3.4 Find the Riemann Metric and the Isometry or Metric-Preserving Map

Here we show how to obtain the Riemann metric matrix G and the isometry, or distance-preserving map.

The Riemann metric matrix G can be found from the ellipses of Alice and Bob at different expressions obtained in Sect. 3.3 as follows. For the ellipse at a point \(\varvec{x}\), denote the semi-major axis by a, the semi-minor axis by b, and the rotation angle of the major axis by \(\theta \); then G can be calculated as

$$\begin{aligned} G &= \begin{pmatrix} g_{11} & g_{12}\\ g_{21} & g_{22} \end{pmatrix},\\ g_{11} &= \frac{a^2\sin ^2\theta +b^2\cos ^2\theta }{a^2b^2},\\ g_{12} &= g_{21} = \frac{\sin \theta \cos \theta \,(b^2-a^2)}{a^2b^2},\\ g_{22} &= \frac{a^2\cos ^2\theta +b^2\sin ^2\theta }{a^2b^2}. \end{aligned}$$
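Equivalently, \(G = R\,\mathrm{diag}(1/a^2, 1/b^2)\,R^T\) with R the rotation by \(\theta \), which gives a two-line sketch (function name ours):

```python
import numpy as np

def metric_from_ellipse(a, b, theta):
    """Riemann metric G of a JND ellipse with semi-major axis a,
    semi-minor axis b, and major-axis angle theta, so that points X
    on the ellipse satisfy X^T G X = 1."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ np.diag([1.0 / a**2, 1.0 / b**2]) @ R.T
```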

Now denote the JND threshold ellipses of Alice and Bob as follows

$$\begin{aligned} X^TG_1X = 1 , \qquad Y^TG_2Y = 1 \end{aligned}$$
(1)

where the Riemann metric matrices of Alice and Bob are

$$\begin{aligned} G_1= \begin{pmatrix} g^{(1)}_{11} & g^{(1)}_{12}\\ g^{(1)}_{21} & g^{(1)}_{22} \end{pmatrix}, \qquad G_2= \begin{pmatrix} g^{(2)}_{11} & g^{(2)}_{12}\\ g^{(2)}_{21} & g^{(2)}_{22} \end{pmatrix} \end{aligned}$$

Now we wish to find a map from Alice's space to Bob's space that preserves the Riemann metric, and therefore the distances and geometry of the space. Such a map is called an isometry in Riemannian geometry. Since the Riemann metrics \(G_1\) and \(G_2\) at every point of the facial expression space characterize the subjective perception of facial expressions of Alice and Bob respectively, an isometry between their expression spaces will preserve these perceptual properties.

In particular, we show how to find a local linearization of the nonlinear map, i.e. a local isometry represented by a matrix M, which by definition maps the JND threshold ellipses of Alice exactly onto those of Bob. This matching of JND ellipses between Alice and Bob is illustrated in Fig. 2. Denote the local linear map M by

$$\begin{aligned} Y = MX \end{aligned}$$
(2)

Substituting (2) into (1) gives \(X^TM^TG_2MX = 1\), so the matrix M must satisfy the following condition in order to be an isometry:

$$\begin{aligned} G_1 = M^TG_2M \end{aligned}$$
(3)
Fig. 2. An isometry matches ellipses in both spaces

In fact, such a matrix is not unique, since M has 4 entries while (3) gives only 3 independent equations. We therefore assume M has the following form

$$\begin{aligned} M = \begin{pmatrix} M_{1} & M_{2}\\ M_{3} & M_{4} \end{pmatrix} := \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix} \begin{pmatrix} a & 0\\ 0 & b \end{pmatrix} \end{aligned}$$
(4)

In general, an isometry in n dimensions can be taken as an \(n \times n\) matrix \(M\in GL(n)\) that is the product of an orthogonal matrix \(R\in O(n)\) (or \(SO(n)\) instead) and an \(n \times n\) diagonal matrix A. Condition (3) gives \(n(n+1)/2\) equations, and since \(\dim O(n) = \dim SO(n)=n(n-1)/2\), the number of unknowns is \(n(n-1)/2+n=n(n+1)/2\), so the solution is determined. For \(n=2\) this is three equations in the three unknowns \(\theta \), a, b of (4).

According to (3) and (4), M can be found by solving the following quadratic equations (see also [9]); a numerical alternative is sketched after the system.

$$\begin{aligned} g^{(1)}_{11} &= g^{(2)}_{11}M^2_1+2g^{(2)}_{12}M_1M_3+g^{(2)}_{22}M^2_3, \\ g^{(1)}_{12} &= g^{(2)}_{11}M_1M_2+g^{(2)}_{12}(M_1M_4+M_2M_3)+g^{(2)}_{22}M_3M_4, \\ g^{(1)}_{22} &= g^{(2)}_{11}M^2_2+2g^{(2)}_{12}M_2M_4+g^{(2)}_{22}M^2_4, \\ 0 &= M_1M_2+M_3M_4. \end{aligned}$$
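These quadratics can also be solved numerically by parametrizing M as in (4); the following SciPy sketch (our own, not necessarily the paper's closed-form solution) recovers M from \(G_1\) and \(G_2\):

```python
import numpy as np
from scipy.optimize import least_squares

def find_isometry(G1, G2):
    """Solve G1 = M^T G2 M for M = R(theta) @ diag(a, b), as in eq. (4)."""
    def residual(p):
        theta, a, b = p
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        M = R @ np.diag([a, b])
        E = M.T @ G2 @ M - G1
        return [E[0, 0], E[0, 1], E[1, 1]]   # the 3 independent equations
    theta, a, b = least_squares(residual, x0=[0.0, 1.0, 1.0]).x
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return R @ np.diag([a, b])
```

By construction, the returned M satisfies \(M^TG_2M \approx G_1\), matching condition (3).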

3.5 Comparison of Expression Perceptions

Now we can use the isometry M to match the JND threshold ellipses of Alice and Bob as in Sect. 3.4, and therefore to compare and exchange facial expression perceptions between them. Multiplying by M maps a point \(P_1=(x_1, y_1)\) in Alice's space to a new point \(P_2=(x_2, y_2)\) in Bob's space; since the map preserves Alice's perceptual properties, the transformed image shows Bob what Alice perceives. The isometry from Bob's space to Alice's space is given by \(M^{-1}\).
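Putting the pieces together, a hypothetical usage might read as follows, where p is the coordinate vector of the common input image in the PCA subspace and G_alice, G_bob, mean_face, pca_basis, H, W come from the earlier sketches (all names are ours):

```python
import numpy as np

M = find_isometry(G_alice, G_bob)    # local isometry of Sect. 3.4
p_in_bob = M @ p                     # eq. (2): Alice's view, shown to Bob
p_in_alice = np.linalg.inv(M) @ p    # M^{-1}: Bob's view, shown to Alice
# Back-project a subspace point to an image for display (Sect. 3.1)
img_for_bob = (mean_face + pca_basis.T @ p_in_bob).reshape(H, W)
```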

4 Experiments

The above procedures were implemented using the database "A database of facial expressions in younger, middle-aged, and older women and men" [6]. Experimental images were created by splitting the morphing movie into individual frames, reducing the image size and number of pixels, and converting to grayscale. Figure 3 shows an example of the resulting facial expression image sequence.

Fig. 3. An example used in the experiment

4.1 Estimate of JND Thresholds

For two subjects, taken as Alice and Bob, we measured the facial expression JND thresholds of image sequences changing from the Neutral to five basic expressions, and estimated the JND threshold ellipse at the Neutral. The five facial expression JND thresholds were determined by the adjustment method. We measured each discrimination threshold 4 times in total, twice from the Neutral towards the other expression and twice from the other expression towards the Neutral, and took the average. Figure 4 shows the facial expression JND thresholds and JND ellipses of the two subjects. Individual differences can be observed easily.

Fig. 4. Facial expression JND thresholds and JND threshold ellipses of Alice and Bob

4.2 Comparison and Exchange Experiments

Here, a comparison experiment on expression perception is performed using the JND ellipses of Alice and Bob obtained in the previous section, shown in Fig. 5. By applying the method of Sects. 3.4 and 3.5 to these ellipses, we can define an isometry M from Alice's facial expression space to Bob's. This makes it possible for Bob to compare Alice's facial expression perception with his own, and also to share Alice's perception.

Fig. 5. Facial expression JND threshold ellipses of Alice and Bob

Fig. 6. Input image to Alice and Bob (Sadness version)

Fig. 7. Isometry M from Alice's space to Bob's, showing Alice's view to Bob (Sadness version)

Fig. 8. Bob compares Alice's view with his own (Sadness version)

Fig. 9. Isometry \(M^{-1}\) from Bob's space to Alice's, showing Bob's view to Alice (Sadness version)

Fig. 10. Alice compares Bob's view with her own (Sadness version)

Fig. 11. Input image to Alice and Bob (Anger version)

Fig. 12. Isometry M from Alice's space to Bob's, showing Alice's view to Bob (Anger version)

Fig. 13. Bob compares Alice's view with his own (Anger version)

Fig. 14. Isometry \(M^{-1}\) from Bob's space to Alice's, showing Bob's view to Alice (Anger version)

Fig. 15. Alice compares Bob's view with her own (Anger version)

For example, suppose that the expression image shown to both Alice and Bob as a common input is Fig. 6, selected from the morphing sequence changing from the Neutral to the Sadness. Multiplying the input vector by the isometry matrix M maps the image from Alice's space to Bob's space; the result, Fig. 7, shows Bob the expression perception of Alice on the input of Fig. 6. Bob can therefore compare the original input he saw with Alice's view in Fig. 7, as shown in Fig. 8. He can also share the same perception as Alice by looking at Fig. 7 instead of the original input. In the other direction, applying the inverse map \(M^{-1}\) to the input of Fig. 6 transforms it from Bob's space to Alice's space; the result is Fig. 9. Alice can now compare her own view of the input image with Bob's view in Fig. 9 and understand the difference, as shown in Fig. 10. By looking at Fig. 9 instead of the original input, Alice can obtain the same facial expression perception as Bob.

Next, we try a different group of expression images using the same isometry M obtained earlier. The input image shown to Alice and Bob is Fig. 11, selected from the morphing sequence changing from the Neutral to the Anger. The isometry M maps the input image of Fig. 11 from Alice's space to Bob's space; the resulting image in Bob's space, Fig. 12, shows Bob the expression perception of Alice. Comparing it with his own view of the same input, Bob can see the difference between himself and Alice, as shown in Fig. 13. He can also obtain the same perception as Alice by looking at Fig. 12 instead of the original input. Again, applying the inverse map \(M^{-1}\) to the input of Fig. 11 maps it from Bob's space to Alice's; the result is Fig. 14. Similarly, Alice can compare her view of the original input with Bob's view in Fig. 14 and find the discrepancy between them. By looking at this image instead of the original input, Alice can obtain the same facial expression perception as Bob (Fig. 15).

5 Discussion

In the experiments of Sect. 4.1, we observed individual differences in the facial expression JND threshold ellipses at the Neutral. On the other hand, since the number of threshold measurements is small, one needs to improve the accuracy of both the ellipse estimation and the expression discrimination thresholds by increasing the number of morphing sequences. A problem is that most facial expression images in published databases cover only Ekman's 6 basic facial expressions and therefore do not provide enough variation. One possibility is to produce image sequences between various facial expressions using CG tools. Also, PCA is used here to create the expression image space, but nonlinear dimension reduction such as manifold learning could be useful.

6 Conclusions and Future Work

We proposed to characterize individual perception of facial expressions by the Riemann metric tensor in the facial expression space, and showed algorithms to compare and exchange subjective perceptions between different individuals using isometries between Riemann spaces. We presented experiments on modeling individual differences and sharing facial expression perceptions. By transforming the input with an isometry, it becomes possible to subjectively compare impressions between individuals and to share the way the other person perceives the same facial expression image.

Future work includes extending the isometry to other expressions in order to build a global isometry of the space. It is known that local isometries at different points can be smoothly patched together [10]. Another approach is to construct a global isometry directly. See [8] for related work.