1 Introduction

Frustration is a negative affective state that occurs when goal-directed behavior is blocked (e.g. [1]). Because driving is normally done for a purpose (e.g. going to work, taking the kids to school or quickly driving to the supermarket), drivers frequently experience frustration when they face obstacles, such as traffic jams and red lights, or when they have trouble programming their in-vehicle navigation or infotainment systems due to badly designed interfaces. Frustration can lead to aggressive behaviors (e.g. [2]) and can affect driving behavior due to negative effects on cognitive processes relevant for driving [3]. In addition, the negative user experience that comes along with frustration impacts user interaction with technical systems in general and has a significant influence on the acceptance of technical systems [4]: the lower the quality of the user experience, the lower the acceptance and thus also the willingness to use and buy a technical system. However, frustrating experiences when using technical systems, especially in complex traffic, cannot always be avoided by design. Here, affect-aware vehicles that recognize the driver’s degree of frustration and, based on this, offer assistance to reduce frustration or mitigate its negative effects promise a remedy (e.g. [5,6,7,8]). As a prerequisite for the development of frustration-aware vehicles, a method for recognizing frustration in real time is needed. Humans communicate emotions by changing their facial expression, so understanding a person’s facial expression can help to infer his or her current emotional state [9]. Hence, automated recognition of a frustrated facial expression could be an important building block for developing affect-aware vehicles. Notably, recent studies identified facial activation patterns that correlate with the experience of frustration during driving and that may be automatically classified as an indicator of frustration [10, 11].

Consequently, the goal of this work is to present a facial expression classifier that is capable of determining from video recordings whether or not a frustrated facial expression is shown. To demonstrate its real-time capability, we integrated the classifier into a demonstrator, the Frust-O-Meter, which works as part of a realistic driving simulator. In the following, we describe the development and performance of the classifier, introduce the modules of the Frust-O-Meter as well as their interplay, and finally discuss potential improvements of the demonstrator together with ideas for further research.

2 Facial Expression Classifier

2.1 Short Description of Data Set

The data set used for training and validation of the classifier stems from an earlier driving simulator study with 30 participants conducted to investigate facial muscle activities that are indicative of frustration. Participants had to drive through an urban street with the task to quickly deliver a parcel to a client. Obstacles, such as red traffic lights, traffic jams, or construction sites, blocked the participants and thus induced frustration (for details, see [10, 11]). Participants’ faces were videotaped using a standard IP cam (resolution of 1280 × 720 pixels) at 10 frames per second. The software package Emotient FACET (Imotions, Copenhagen, Denmark) was used to extract the evidence of activity of 19 facial action units (AUs) frame-wise from the facial videos. Using a data-driven clustering approach based on the AU data averaged in windows of 1 s, five facial expressions were identified that predominantly occurred in the data set, corresponding to a frustrated facial expression, two slightly different neutral expressions (neutral 1 and neutral 2), smiling and frowning (for details, see [11]). These facial expressions were used as labels for the classifier development in this paper. Thus, the final data set contained the activity information for 19 AUs together with a facial expression label (neutral 1, smiling, frowning, frustrated or neutral 2) for each second from 30 participants driving for roughly 10 min (30 participants × 10 min × 60 s ≈ 18,000 data points).
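As an illustration, the following is a minimal sketch of this 1 s averaging step in Python (pandas), assuming the FACET export is available as a CSV file with a timestamp column and one evidence column per AU; the file name and column layout are assumptions and may differ from the actual export format.

```python
import pandas as pd

# Hypothetical FACET export: frame-wise AU evidence at 10 fps with a
# 'timestamp' column (in seconds) and 19 columns named 'AU01', 'AU02', ...
facet = pd.read_csv("facet_export_participant_01.csv")

au_cols = [c for c in facet.columns if c.startswith("AU")]

# Average the frame-wise evidence in non-overlapping 1 s windows, yielding
# one 19-dimensional feature vector per second (the data points used below).
facet["second"] = facet["timestamp"].astype(int)
windowed = facet.groupby("second")[au_cols].mean()
```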

2.2 Classifier Set-up for Frame-to-Frame Frustration Classification

To classify the labeled data, a multi-layer perceptron (MLP) was used. This type of supervised learning method uses interconnected layers of neurons to learn and generalize over training examples. An MLP learns by adjusting the initially random weights attached to the connections between neurons in order to minimize the output error. The algorithm used to adjust the weights is the backpropagation algorithm [13], which realizes gradient descent on the output error: by repeatedly adjusting the weights along the negative error gradient, the net learns to discriminate the different classes of data.
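For reference, the weight update that backpropagation realizes for each weight can be written as follows (with learning rate η and output error E):

```latex
w_{ij} \leftarrow w_{ij} - \eta \, \frac{\partial E}{\partial w_{ij}}
```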

Feature vectors with 19 dimensions (corresponding to the 19 AUs) were fed into the net with a batch size of 32. Three fully connected hidden layers (32, 16 and 16 neurons) were used to project onto the output layer. The activation of the output layer (5 neurons) was calculated with a softmax function to generate the probabilities for each label. The argmax of the output layer returned the predicted label for each sample. The classifier was implemented in Python; the computational graph underlying the neural net was written with the TensorFlow package.
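A minimal sketch of this architecture using the Keras API of TensorFlow is shown below; the ReLU hidden-layer activation and the Adam optimizer are assumptions, as the text above only specifies backpropagation-based training.

```python
import tensorflow as tf

# Sketch of the described MLP: 19 AU inputs, three fully connected hidden
# layers (32, 16, 16 neurons) and a 5-class softmax output layer.
# Hidden-layer activation (ReLU) and optimizer (Adam) are assumptions.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(19,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# The predicted label is the argmax over the five class probabilities:
# predicted_labels = model.predict(batch_of_au_vectors).argmax(axis=1)
```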

2.3 Evaluation Procedure

Before the training of the artificial neural network was started, 20% of the data were randomly split off as a hold-out set for later testing and never used during training. The remaining 80% of the data were split into 70% training and 30% validation data. In each epoch, both sets were randomly shuffled; the training data were used for training, and the validation set was used at the end of the epoch to check the performance and test for potential overfitting of the net. After training was finished, the hold-out set was used to test the performance of the net on previously unseen data. Finally, the structure and weights were saved for later use in the demonstrator.
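A sketch of this procedure is given below, assuming the feature matrix X, the label vector y and the model from the sketch above; the scikit-learn splitting utility, the random seed and the number of epochs are assumptions.

```python
from sklearn.model_selection import train_test_split

# 20% hold-out test set, never used during training.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.20, shuffle=True, random_state=42)

# Remaining 80%: 70% training, 30% validation.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.30, shuffle=True, random_state=42)

# Shuffle each epoch; validate at the end of every epoch to monitor overfitting.
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          batch_size=32, epochs=50, shuffle=True)

# Performance on previously unseen data.
test_loss, test_acc = model.evaluate(X_test, y_test)

# Save structure and weights for later use in the demonstrator.
model.save("frustration_mlp.h5")
```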

2.4 Classification Results

The classification result shown in Fig. 1 represents the performance of the trained net on the test set. The true labels are plotted on the y-axis against the predicted labels on the x-axis. An overall accuracy of 69% was reached on the test set with the MLP. With 93%, the accuracy was highest for the frustrated facial expression, while most of the other expressions also reached relatively high accuracies (neutral 1: 70%, smiling: 71%, frowning: 79%). Only the expression ‘neutral 2’ had a lower accuracy (41%) and was repeatedly misclassified as frowning. As our goal was to construct a classifier for detecting the facial expression of frustration, the performance of this net was considered acceptable.

Fig. 1. Confusion matrix depicting the classification accuracy of the employed classifier.
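A minimal sketch of how such a row-normalized confusion matrix can be produced with scikit-learn, assuming X_test, y_test and the trained model from the sketches above; the label order is an assumption.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Assumed mapping of class indices to expression names.
labels = ["neutral 1", "smiling", "frowning", "frustrated", "neutral 2"]

# Predicted class = argmax of the softmax output for each test sample.
y_pred = model.predict(X_test).argmax(axis=1)

# Row-normalized confusion matrix: true labels on the y-axis,
# predicted labels on the x-axis, as in Fig. 1.
ConfusionMatrixDisplay.from_predictions(
    y_test, y_pred, display_labels=labels, normalize="true")
plt.show()
```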

3 Demonstrator: The Frust-O-Meter

In order to demonstrate the real-time capability of the classifier, we integrated it into a demonstrator called the Frust-O-Meter. The Frust-O-Meter is currently integrated into a realistic driving simulator and consists of five modules: (1) a webcam, (2) a preprocessing unit, (3) a user model, (4) an adaptation unit and (5) a user interface (see Fig. 2 for a sketch of the architecture). The modules are detailed in the following:

Fig. 2. Sketch of the architecture of the Frust-O-Meter.

  • Webcam: The webcam is a standard Logitech C920 webcam recording with a resolution of 1920 × 1080 at a frame rate of 30 fps. The camera is mounted on the dashboard to record the participant’s face from a frontal position. The video data are streamed to the preprocessing unit.

  • Preprocessing unit: The purpose of the preprocessing unit is to extract the frame-wise activation of the facial AUs from the video stream and to make it available to the user model for further processing. In the current version, the commercial software package Emotient FACET (Imotions, Copenhagen, Denmark) is used for this step; it estimates the evidence of activity for 19 AUs for each frame. Thus, a 19-dimensional vector for each frame is passed on to the user model.

  • User model: In the user model, the preprocessed facial activation data are interpreted to obtain an estimate of the user’s current degree of frustration. For this, the model first classifies the incoming AU data with the trained facial expression classifier described above into the currently shown facial expression (two different neutral expressions, smiling, frowning or frustrated; see Sect. 2). Following the result reported in [11] that the frequency of showing the frustrated facial expression correlates with the subjective frustration experience, the occurrence of this facial expression is integrated over the last 20 data points in order to estimate the current degree of frustration. This means that the frustration estimate can take values between 0 (no frustrated facial expression was shown during the last 20 data points) and 20 (the frustrated facial expression was shown permanently during the last 20 data points). The result of the frustration estimation is passed on to the adaptation unit and the user interface (a minimal sketch of this integration step and the adaptation trigger is shown after this list).

  • Adaptation unit: The idea of the adaptation unit is to select and execute an appropriate adaptation strategy that supports the user in mitigating her or his currently experienced level of frustration or helps to reduce the negative effects frustration has on (driving) performance. Currently, this is realized in a very simple form by randomly playing one of two happy songs (either Hooked on a Feeling in the version by Blue Swede or Have You Ever Seen the Rain by Creedence Clearwater Revival) via loudspeakers for about one minute once the frustration estimate reaches a threshold value of 15 or above.

  • User interface: Finally, the purpose of the user interface is to present the frustration estimate as well as the output of the facial expression classifier to the user. In the simulator, the user interface is shown on a monitor in the center console of the vehicle mock-up. The upper half of the monitor contains a display of the frustration estimate in the style of a speedometer (which explains Frust-O-Meter as the name of the demonstrator), in which values above 15 are displayed in red, while the remainder is shown in white (see Fig. 2). The lower half contains smileys for the facial expressions together with a time series display of the classifier output over a configurable time window.
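The following is a minimal sketch of the integration step in the user model and the threshold-based adaptation trigger described above; the class index, the song file names and the pygame playback backend are assumptions made for illustration.

```python
import random
from collections import deque

import pygame  # assumed audio backend; any playback library would do

FRUSTRATED = 3        # assumed index of the 'frustrated' class in the classifier output
WINDOW = 20           # number of 1 s classifier outputs to integrate
THRESHOLD = 15        # frustration estimate at which the adaptation is triggered
SONGS = ["hooked_on_a_feeling.mp3",
         "have_you_ever_seen_the_rain.mp3"]   # hypothetical file names

pygame.mixer.init()
recent_labels = deque(maxlen=WINDOW)

def update_user_model(predicted_label):
    """Integrate the frustrated facial expression over the last WINDOW data points."""
    recent_labels.append(predicted_label)
    return sum(1 for label in recent_labels if label == FRUSTRATED)   # 0 .. 20

def adapt(frustration_estimate):
    """Play a randomly chosen happy song once the estimate reaches the threshold."""
    if frustration_estimate >= THRESHOLD and not pygame.mixer.music.get_busy():
        pygame.mixer.music.load(random.choice(SONGS))
        pygame.mixer.music.play()
```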

The modules of the Frust-O-Meter work in real time, so that users can drive through a simulated urban environment (realized in Virtual Test Drive, VIRES Simulationstechnologie, Bad Aibling, Germany) with frustrating situations, such as construction sites or red lights (cf. [2, 10, 11]). In this way, it is possible for users to experience a real-time adaptation of a system to their current frustration level.

4 Discussion and Outlook

Here, we presented a real-time capable classifier for recognizing a frustrated facial expression from video streams and its integration into the Frust-O-Meter, a demonstrator of a frustration-aware vehicle. The Frust-O-Meter links a real-time estimation of the degree of frustration from the facial expression with a simple adaptation and a user interface that communicates the current level of frustration. The user model of the demonstrator estimates the current degree of frustration based on a temporal integration of facial expressions classified as frustrated. As a classifier, we used a multi-layer perceptron that was trained on video recordings of 30 participants experiencing frustration in a driving simulator study. The adaptation to the user’s frustration is currently realized by playing a happy song once a certain degree of frustration is reached.

While the Frust-O-Meter is a useful means to let people experience the idea of a real-time adaptation to their degree of frustration, many challenges need to be tackled to develop a fully functioning frustration-aware vehicle. For instance, affective states like frustration are multi-component processes [14, 15] that not only manifest in the facial expression, but also come along with changes in cognitive appraisals, physiology, gestures or prosody. Therefore, information from other sensors besides a webcam (e.g. electrocardiogram, infrared imaging, or microphone [16,17,18,19]) should be integrated into the user model to improve the frustration estimation. Of course, extending the sensor set also demands a more sophisticated preprocessing unit and user model. Moreover, the only possible adaptation was playing a happy song randomly chosen from a set compiled based on the authors’ preferences. Although music has the potential to improve drivers’ mood [20], other strategies may be even better suited. Empathic voice assistants that support drivers in dealing with their frustration or help to overcome the causes of frustration (e.g. by offering help when dealing with a badly designed interface) seem very promising [8, 21]. To realize this, the user model needs information about the context to derive the cause of frustration, and the adaptation unit needs suitable approaches to select and apply the best possible strategy, as described for example in [22]. To sum up, notwithstanding the outlined ideas for further research, the Frust-O-Meter is an elegant way to demonstrate a frustration-aware vehicle based on automated recognition of facial expressions from video recordings.