It is our great pleasure to welcome you to the 8th Audio-Visual Emotion Challenge (AVEC'18), held in conjunction with ACM Multimedia 2018 in Seoul, Korea.
This year's challenge and associated workshop continue to push the boundaries of audio-visual emotion and health recognition towards real-life applications of affective computing. Looking back at the history of AVEC, the first challenge posed the problem of detecting discrete emotion classes on a large set of natural behaviour data. The second AVEC extended this problem to the prediction of continuous-valued dimensional affect, which was extended further in the third edition to include the prediction of self-reported depression severity, and enriched with new annotations in the fourth edition. Physiological signals were then introduced for the prediction of dimensional affect in the fifth AVEC. The sixth edition focused on depression analysis from human-agent interactions, and the seventh on emotion recognition from human behaviours captured 'in-the-wild'. Finally, for this year's edition, we proposed cross-cultural affect prediction 'in-the-wild', classification of bipolar disorder, and generation of dimensional labels for continuous emotion recognition.
The mission of the AVEC series is to provide a common benchmark test set for multimodal information processing and to compare the merits of different approaches under well-defined and strictly comparable conditions. The main underlying motivation is the need to advance emotion and health estimation for multimedia retrieval to a level where behaviours can be reliably sensed in real-life conditions, as this is exactly the type of data that the new generation of affect-oriented multimedia and human-machine/human-robot communication interfaces has to face in the real world.
The call for participation attracted 23 submissions from Asia, Europe, Oceania and North America. The programme committee accepted 11 papers in addition to the baseline paper for oral presentation. We hope that these proceedings will serve as a valuable reference for researchers and developers in the area of audio-visual emotion and health sensing.
Proceedings
Interpersonal Behavior Modeling for Personality, Affect, and Mental States Recognition and Analysis
Imagine humans as complex dynamical systems: systems that are characterized by multiple interacting layers of hidden states (e.g., internal processes involving functions of cognition, perception, production, emotion, and social interaction) producing ...
AVEC 2018 Workshop and Challenge: Bipolar Disorder and Cross-Cultural Affect Recognition
Fabien Ringeval, Björn Schuller, Michel Valstar, Roddy Cowie, Heysem Kaya, Maximilian Schmitt, Shahin Amiriparian, Nicholas Cummins, Denis Lalanne, Adrien Michaud, Elvan Ciftçi, Hüseyin Güleç, Albert Ali Salah, Maja Pantic
The Audio/Visual Emotion Challenge and Workshop (AVEC 2018), "Bipolar disorder, and cross-cultural affect recognition", is the eighth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic ...
Bipolar Disorder Recognition with Histogram Features of Arousal and Body Gestures
This paper targets the Bipolar Disorder Challenge (BDC) task of the Audio/Visual Emotion Challenge (AVEC) 2018. Firstly, two novel features are proposed: 1) a histogram-based arousal feature, in which the continuous arousal values are estimated from the ...
Bipolar Disorder Recognition via Multi-scale Discriminative Audio Temporal Representation
Bipolar disorder (BD) is a prevalent mental illness which has a negative impact on work and social function. However, bipolar symptoms are episodic, especially with irregular variations among different episodes, making BD very difficult to diagnose ...
Multi-modality Hierarchical Recall based on GBDTs for Bipolar Disorder Classification
In this paper, we propose a novel hierarchical recall model fusing multiple modality (including audio, video and text) for bipolar disorder classification, where patients with different mania level are recalled layer-by-layer. To address the complex ...
Automated Screening for Bipolar Disorder from Audio/Visual Modalities
This paper addresses the Bipolar Disorder sub-challenge of the Audio/Visual Emotion recognition Challenge (AVEC) 2018, where the objective is to classify patients suffering from bipolar disorder into states of remission, hypo-mania, and mania, from ...
Speech-based Continuous Emotion Prediction by Learning Perception Responses related to Salient Events: A Study based on Vocal Affect Bursts and Cross-Cultural Affect in AVEC 2018
This paper presents a novel framework for speech-based continuous emotion prediction. The proposed model characterises the perceived emotion estimation as time-invariant responses to salient events. Then arousal and valence variation over time is ...
Multimodal Continuous Emotion Recognition with Data Augmentation Using Recurrent Neural Networks
This paper presents our efforts for the Cross-cultural Emotion Sub-challenge in the Audio/Visual Emotion Challenge (AVEC) 2018, whose goal is to predict the level of three emotional dimensions time-continuously in a cross-cultural setup. We extract the ...
Multi-modal Multi-cultural Dimensional Continuous Emotion Recognition in Dyadic Interactions
Automatic emotion recognition is a challenging task which can make great impact on improving natural human computer interactions. In this paper, we present our solutions for the Cross-cultural Emotion Sub-challenge (CES) of Audio/Visual Emotion ...
Towards a Better Gold Standard: Denoising and Modelling Continuous Emotion Annotations Based on Feature Agglomeration and Outlier Regularisation
Emotions are often perceived by humans through a series of multimodal cues, such as verbal expressions, facial expressions and gestures. In order to recognise emotions automatically, reliable emotional labels are required to learn a mapping from human ...
Fusing Annotations with Majority Vote Triplet Embeddings
Human annotations of behavioral constructs are of great importance to the machine learning community because of the difficulty in quantifying states that cannot be directly observed, such as dimensional emotion. Disagreements between annotators and ...
Deep Learning for Continuous Multiple Time Series Annotations
Learning from multiple annotations is an increasingly important research topic. Compared with conventional classification or regression problems, it faces more challenges because time-continuous annotations would result in noise and temporal lags ...
Learning an Arousal-Valence Speech Front-End Network using Media Data In-the-Wild for Emotion Recognition
Recent progress in speech emotion recognition (SER) technology has benefited from the use of deep learning techniques. However, expensive human annotation and difficulty in emotion database collection make it challenging for rapid deployment of SER ...