Towards Human Affect Modeling: A Comparative Analysis of Discrete Affect and Valence-Arousal Labeling

Aslan, Sinem; Okur, Eda; Alyuz, Nese; Arslan Esme, Asli; Baker, Ryan S.

doi:10.1007/978-3-319-92279-9_50

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 851))

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

There is still considerable disagreement on key aspects of affective computing - including even how affect itself is conceptualized. Using a multi-modal student dataset collected while students were watching instructional videos and answering questions on a learning platform, we compared the two key paradigms of how affect is represented: (1) affect as a set of discrete states and (2) affect as a combination of attributes in a two-dimensional space. We specifically examined a set of discrete learning-related affects (Satisfied, Confused, and Bored) that are hypothesized to map to specific locations within the Valence-Arousal dimensions of the Circumplex Model of Emotion. For each paradigm, we had five human experts label student affect on the dataset. We investigated two major research questions using their labels: (1) whether the hypothesized mappings between discrete affects and Valence-Arousal are valid and (2) whether affect labeling is more reliable with discrete affect or Valence-Arousal. Contrary to expectations, the results show that discrete labels did not directly map to Valence-Arousal quadrants in the Circumplex Model of Emotion. This indicates that the experts perceived and labeled affect differently under the two paradigms. On the other hand, the inter-rater agreement results show that the experts moderately agreed with each other within both paradigms. These results imply that researchers and practitioners should consider how affect information would operationally be used in an intelligent system when choosing between the two key paradigms of affect.

1 Introduction

Affect has become an important area of research within learning [1,2,3]. Data labeling is a preliminary step towards training machine learning models to provide affect-related analytics to teachers and learners. However, the related literature lacks agreement even on how affect itself is conceptualized. There are two major paradigms for affect representation: (1) affect as a set of discrete states [4,5,6,7,8,9] and (2) affect as a combination of attributes in a two-dimensional space [11].

There are several benefits to viewing student affect as a set of discrete states. One such benefit is that students’ actual states are easier to understand, which in turn makes it easier to drive customized interventions. However, labeling discrete affective states presents a challenge to observers in distinguishing between closely related affective states. For instance, confusion and frustration are often treated as separate affective states (e.g., [8]), but Liu et al. [10] argue that they may simply represent different ranges of a continuum. Researchers using discrete sets of affective states often also struggle with how to distinguish neutral affect from mild affect and how to handle uncommon affect outside the core affect labeling scheme. These challenges can represent major risks to the quality of affect labeling in ways that are not easily seen in overall inter-rater agreement values that cut across large numbers of constructs. These issues may particularly emerge in situations where affect labelers have limited training or are asked to label data where the video is sometimes ambiguous, due to factors such as facial occlusion, adverse pose variations, or gum chewing.

In this paper, we study this issue in a focused fashion by examining a set of discrete affective states that can be reasonably expected to correspond to specific locations within the Circumplex Model of Emotion [11]. Specifically, we study (see Fig. 1): Satisfied, which can be hypothesized to map to Positive Valence (regardless of Arousal); Bored, which can be hypothesized to map to Negative Valence and Low Arousal; and Confused, which can be hypothesized to map to Negative Valence and High Arousal. Using the student dataset in [12] and the Human-Expert Labeling Process (HELP) [13] as a baseline labeling protocol, we test whether these mappings between discrete affective states and Valence-Arousal are valid and whether affect labeling is more reliable with discrete affective states or Valence-Arousal.
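
The hypothesized mapping described above can be written down compactly. The following is a minimal sketch, not the authors' implementation: the label strings ("Positive", "Negative", "Low", "High") and the names `HYPOTHESIZED_LOCATION` and `matches_hypothesis` are assumptions introduced only for illustration.

```python
# Hypothesized location of each discrete state in Valence-Arousal space.
# Label strings are illustrative assumptions, not taken from the paper.
# An arousal of None means "any arousal level" (Satisfied spans Low and High).
HYPOTHESIZED_LOCATION = {
    "Satisfied": ("Positive", None),
    "Bored": ("Negative", "Low"),
    "Confused": ("Negative", "High"),
}


def matches_hypothesis(state, valence, arousal):
    """Return True if (valence, arousal) lies in the region hypothesized for `state`."""
    expected_valence, expected_arousal = HYPOTHESIZED_LOCATION[state]
    if valence != expected_valence:
        return False
    # A state with no arousal constraint accepts every arousal level.
    return expected_arousal is None or arousal == expected_arousal
```

Under this sketch, a Satisfied label paired with Positive Valence counts as consistent with the hypothesis at either arousal level, whereas a Bored label paired with High Arousal does not.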

2 Data Collection

In this study, we used student data that was a subset of a larger dataset previously collected through authentic classroom pilots [12]. These pilots took place in an after-school Math course in an urban high school in Turkey. In these pilots, the students used an online educational platform to watch instructional videos and answer related questions. Meanwhile, our data collection application ran in the background to collect two video streams: (1) student appearance videos from the camera (to monitor observable cues available in the student’s face or upper body); and (2) student desktop videos (to monitor contextual information).

3 Labeling Tool, Human-Experts, and Training

A labeling tool was developed and customized for use in multiple labeling experiments. In Fig. 2, a sample view for labeling Valence is shown.

Using HELP [13] as a baseline labeling protocol, five human experts with backgrounds in Psychology/Educational Psychology were recruited and trained (see Tables 1 and 2 for operational definitions of labels). Based on observed state changes, the experts provided their Valence-Arousal or discrete affect labels using all available cues (e.g., student video/audio, desktop recording with mouse cursor locations, and any relevant contextual information from the device and content platform).

Table 1. Operational definitions of discrete affect labels

Table 2. Operational definitions of Valence-Arousal labels

The human experts first labeled a total of seven hours of student data with Valence-Arousal labels. One week later, we asked them to label the same data with discrete affect labels. Note that although the experts labeled Arousal using three different levels, we combined the Low and Medium labels into a single Low class for the analysis of the labeled data, based on the experimental results outlined in [14].
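
As a small illustration of the Arousal merging step, the sketch below collapses a three-level label into two classes. The label strings, in particular "High" as the third level, and the names `AROUSAL_COLLAPSE` and `collapse_arousal` are assumptions made for this example.

```python
# Assumed label strings for the three-level Arousal scheme (illustrative only).
AROUSAL_COLLAPSE = {"Low": "Low", "Medium": "Low", "High": "High"}


def collapse_arousal(labels):
    """Merge Medium into Low, leaving a binary Low/High Arousal label per instance."""
    return [AROUSAL_COLLAPSE[label] for label in labels]
```

Using an explicit lookup table rather than a conditional means an unexpected label raises a `KeyError` instead of being silently assigned to one of the two classes.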

4 Comparing Discrete Affect Labels to Valence-Arousal Labels

4.1 Pre-processing of Label Data

To analyze the labeling output, for both discrete affect and Valence-Arousal labeling, we took two pre-processing steps. First, we applied windowing on the labeling output to obtain aligned instance-wise labels for each individual expert. Second, to facilitate analysis, we derived a consensus label from all the expert labels for each instance, using majority voting in each case.
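
The two pre-processing steps can be sketched as follows. This is a hedged illustration under several assumptions that the text does not specify: each expert's output is taken to be a list of `(start, end, label)` segments in seconds, the window length is a free parameter, a window takes the label covering most of its duration, and ties are resolved in favor of the label encountered first. The function names are hypothetical.

```python
from collections import Counter


def window_labels(segments, total_seconds, window_seconds):
    """Convert (start, end, label) segments into one label per fixed-length window.

    Each window receives the label that covers most of its duration;
    windows with no coverage receive None.
    """
    n_windows = int(total_seconds // window_seconds)
    windowed = []
    for i in range(n_windows):
        w_start, w_end = i * window_seconds, (i + 1) * window_seconds
        coverage = Counter()
        for start, end, label in segments:
            overlap = min(end, w_end) - max(start, w_start)
            if overlap > 0:
                coverage[label] += overlap
        windowed.append(coverage.most_common(1)[0][0] if coverage else None)
    return windowed


def majority_vote(labels_per_expert):
    """Derive one consensus label per instance from aligned per-expert label lists."""
    consensus = []
    for instance_labels in zip(*labels_per_expert):
        votes = Counter(label for label in instance_labels if label is not None)
        # Ties fall to the label seen first (an assumption; the text gives no tie rule).
        consensus.append(votes.most_common(1)[0][0] if votes else None)
    return consensus
```

Windowing every expert's segments with the same `total_seconds` and `window_seconds` yields lists of equal length, which is what allows `majority_vote` to align them instance by instance.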

4.2 Metrics for Analysis

The derived consensus labels were then compared with each other to measure the degree to which each discrete affective state mapped to each Valence-Arousal quadrant. The hypotheses for how discrete affective states would map to Valence-Arousal were presented in the Introduction (Fig. 1). We calculated the degree of mapping using Precision, Recall, and F1 measures. For these calculations, the labeled set (e.g., discrete affective states) acts as the true labels, whereas the mapped set (e.g., Valence-Arousal mapped to discrete affective states as hypothesized) serves as the predictions. Precision is calculated as the ratio of correct predictions (i.e., true positives) to all predictions (i.e., the sum of true positives and false positives), whereas Recall is calculated as the ratio of correct predictions to all true labels (i.e., the sum of true positives and false negatives). The F1 measure is the harmonic mean of Precision and Recall, which takes the trade-off between the two into account. In addition, we checked inter-rater agreement for the different labeling tasks (i.e., Discrete Affects, Arousal, Valence) to assess the reliability of the obtained label data. As proposed in HELP [13], we used Krippendorff’s alpha to compute inter-rater agreement among experts.
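
The Precision, Recall, and F1 definitions above translate directly into code. The sketch below computes them for a single target label from two aligned label lists; the function name `precision_recall_f1` and the convention of returning 0.0 when a denominator is zero are assumptions made for this example.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Precision, Recall, and F1 for one target label (`positive`)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (assumed convention: report 0.0).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1
```

Here `true_labels` would hold the labeled set and `predicted_labels` the mapped set, matching the roles described in the paragraph above.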

4.3 Methods for Analysis

To investigate whether the discrete affective states (i.e., Satisfied, Bored, and Confused) actually map to the hypothesized Valence-Arousal quadrants, the degree of mapping was computed using the final labels for the following mapping/comparison sets:

Valence vs. Discrete Affect-to-Valence: We compared Valence labels to discrete affect labels, where affect labels were mapped to Valence labels using: Satisfied to Positive Valence, and Bored/Confused to Negative Valence.
Arousal vs. Discrete Affect-to-Arousal: We compared Arousal labels to discrete affect labels, where affect labels were mapped to Arousal labels using: Bored to Low Arousal, and Confused to High Arousal. Note that Satisfied samples were disregarded in this case since we hypothesized that they could map to both Low and High Arousal on the Circumplex Model of Emotion (See Fig. 1).
Discrete Affect vs. Valence/Arousal-to-Discrete Affect: We compared discrete affect labels to Valence-Arousal labels, where Valence-Arousal label pairs were mapped to discrete affect labels using: Low/High Arousal & Positive Valence to Satisfied, Low Arousal & Negative Valence to Bored, and High Arousal & Negative Valence to Confused.
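
The three comparison sets listed above can be assembled from the consensus labels as in the following sketch. It assumes binary Valence (Positive/Negative) and binary Arousal (Low/High) label strings, and all names (`AFFECT_TO_VALENCE`, `AFFECT_TO_AROUSAL`, `va_to_affect`, `build_comparison_sets`) are hypothetical. In each returned pair, the first list is the labeled set (true labels) and the second is the mapped set (predictions).

```python
# Assumed label strings; binary Valence and binary Arousal are taken as given.
AFFECT_TO_VALENCE = {"Satisfied": "Positive", "Bored": "Negative", "Confused": "Negative"}
AFFECT_TO_AROUSAL = {"Bored": "Low", "Confused": "High"}  # Satisfied is excluded on purpose


def va_to_affect(valence, arousal):
    """Map a Valence-Arousal pair to the hypothesized discrete state."""
    if valence == "Positive":
        return "Satisfied"  # Low or High Arousal
    if arousal == "Low":
        return "Bored"      # Negative Valence, Low Arousal
    return "Confused"       # Negative Valence, High Arousal


def build_comparison_sets(affect, valence, arousal):
    """Return (true, predicted) label lists for the three comparisons."""
    valence_set = (valence, [AFFECT_TO_VALENCE[a] for a in affect])
    # Keep only instances whose discrete label has a hypothesized arousal level.
    kept = [(r, AFFECT_TO_AROUSAL[a]) for a, r in zip(affect, arousal) if a in AFFECT_TO_AROUSAL]
    arousal_set = ([r for r, _ in kept], [m for _, m in kept])
    affect_set = (affect, [va_to_affect(v, r) for v, r in zip(valence, arousal)])
    return {"valence": valence_set, "arousal": arousal_set, "affect": affect_set}
```

Each pair can then be scored per class with Precision, Recall, and F1. Note that only the Arousal comparison drops instances, so its lists may be shorter than the other two.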

5 Results

5.1 Mapping Between Discrete Affect and Valence-Arousal Labels

The Precision, Recall, and F1 measures calculated for each mapping set are summarized in Table 3. As these results indicate, relatively higher F1 measures (consistent for both state-specific and overall results) were achieved when discrete affect labels were mapped to Positive/Negative Valence (i.e., Valence vs. Discrete Affect-to-Valence). However, the F1 values were lower when discrete affect labels were mapped to High/Low Arousal (i.e., Arousal vs. Discrete Affect-to-Arousal). Although the overall F1 measures seemed reasonable when Valence-Arousal labels were mapped to discrete affects (i.e., Discrete Affect vs. Valence/Arousal-to-Discrete Affect), the state-specific measures revealed an inconsistency. A likely reason is that High Arousal samples made up less than ~1.2% of the data, so the samples labeled as Confused were drawn mostly from Low Arousal samples. This issue was most visible in the Recall and F1 results for the Valence-Arousal vs. Discrete Affect mapping. Note that although we disregarded Satisfied samples in the Arousal vs. Discrete Affect-to-Arousal case under the hypothesis that they could map to both Low and High Arousal, we also checked and observed that, among all the Satisfied instances, 99.8% mapped to Low Arousal and only 0.2% mapped to High Arousal. The scarcity of High Arousal is common to all three discrete affective states: Satisfied (0.2% High Arousal), Bored (2.2% High Arousal), and Confused (3.3% High Arousal).
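
The per-state Arousal shares reported above are a simple cross-tabulation of two aligned label lists. The sketch below shows one way such percentages could be computed; the function name `arousal_share_by_state` and the label strings are assumptions, and no data from the study is reproduced here.

```python
from collections import Counter, defaultdict


def arousal_share_by_state(affect, arousal):
    """Percentage of each Arousal level within every discrete affective state."""
    counts = defaultdict(Counter)
    for state, level in zip(affect, arousal):
        counts[state][level] += 1
    return {
        state: {level: 100.0 * n / sum(c.values()) for level, n in c.items()}
        for state, c in counts.items()
    }
```

A strongly skewed result, with nearly all instances of every state falling under Low Arousal, is the pattern that would limit Recall for any state hypothesized to sit in a High Arousal quadrant.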

Table 3. Precision/Recall/F1 measures for the mappings between discrete affect labels and Valence-Arousal labels

5.2 Inter-rater Agreement for Discrete Affects and Valence-Arousal Labeling

The inter-rater agreement results for discrete affect labeling compared to Valence-Arousal labeling are given in Table 4. The average of the confusion matrices computed for the discrete affect labels over all pairs of experts (i.e., every pair among the five experts) is given in Table 5. As these results indicate, the inter-rater agreement was lower for discrete affect labeling, where the pairwise confusion results showed that the experts had difficulty differentiating between Satisfied and either of the other two states (Bored or Confused).
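
For readers unfamiliar with the agreement metric, the following is a minimal sketch of Krippendorff's alpha for nominal labels, computed from a coincidence matrix. It is a generic textbook formulation rather than the authors' code. The input layout (one list of expert labels per instance, with None for a missing label) and the function name are assumptions, and returning 1.0 when expected disagreement is zero is a convention chosen here for a case where alpha is formally undefined.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of per-instance label lists (one label per expert);
    None marks a missing label.
    """
    coincidences = Counter()
    for unit in units:
        values = [v for v in unit if v is not None]
        m = len(values)
        if m < 2:
            continue  # an instance with fewer than two labels cannot be paired
        # Every ordered pair of labels from different experts, weighted by 1/(m - 1).
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b) / (n - 1)
    # With a single category in use, expected disagreement is zero (assumed: return 1.0).
    return 1.0 - observed / expected if expected else 1.0
```

Alpha equals 1 under perfect agreement, is close to 0 when agreement is at chance level, and becomes negative when experts disagree systematically.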

Table 4. Consensus measures for Discrete Affects vs. Valence-Arousal

Table 5. Average of pair-wise confusion matrices for discrete affects

6 Conclusion

In this paper, through a comparative approach, we investigated the two key paradigms of how affect is represented: (1) affect as a set of discrete states and (2) affect as a combination of attributes in a two-dimensional space. We specifically examined a set of discrete affective states (Satisfied, Confused, and Bored) that can be reasonably expected to map to specific locations within the Valence-Arousal dimensions of the Circumplex Model of Emotion [11]. We examined two major research questions: (1) whether these mappings between discrete affects and Valence-Arousal are valid and (2) whether affect labeling is more reliable with discrete affects or Valence-Arousal. To investigate these questions, we used HELP [13] as a baseline labeling protocol. Using HELP, five human experts labeled seven hours of student data for Valence-Arousal and discrete affect labels.

The relatively low F1 measures (see Table 3) indicate that the discrete affect labels (i.e., Satisfied, Bored, and Confused) do not directly map to Valence-Arousal quadrants in the Circumplex Model of Emotion [11]. This shows that the human experts perceived and labeled affect differently under the two paradigms, although we reasonably expected the discrete affects to map seamlessly onto the model. On the other hand, the inter-rater agreement results (see Table 4) show that the experts moderately agreed with each other in both discrete affect labeling and Valence-Arousal labeling.

These results have two important implications for researchers in the learning analytics field. First, how affect is conceptualized in one paradigm may not transfer seamlessly to another paradigm (i.e., discrete affective states do not directly map to Valence-Arousal quadrants). Therefore, researchers need to decide on the affect labels of interest at the beginning of their research with this limitation in mind. Second, both discrete affect labeling and Valence-Arousal labeling resulted in moderate consensus among the experts. Therefore, researchers should consider how affect information would ultimately be used in a learning system (e.g., affect-aware interventions, feedback to content, etc.) when choosing between Valence-Arousal and discrete affect labeling to generate ground-truth labels for model development.

References

1. Sabourin, J., Mott, B., Lester, J.C.: Modeling learner affect with theoretically grounded dynamic Bayesian networks. In: D’Mello, S., Graesser, A., Schuller, B., Martin, J.-C. (eds.) ACII 2011. LNCS, vol. 6974, pp. 286–295. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24600-5_32
2. Jaques, N., Conati, C., Harley, J.M., Azevedo, R.: Predicting affect from gaze data during interaction with an intelligent tutoring system. In: Trausan-Matu, S., Boyer, K.E., Crosby, M., Panourgia, K. (eds.) ITS 2014. LNCS, vol. 8474, pp. 29–38. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07221-0_4
3. Pardos, Z.A., Baker, R.S., San Pedro, M.O.C.Z., Gowda, S.M., Gowda, S.M.: Affective states and state tests: investigating how affect and engagement during the school year predict end of year learning outcomes. J. Learn. Anal. 1(1), 107–128 (2014)
4. Kapoor, A., Picard, R.W.: Multimodal affect recognition in learning environments. In: International Conference on Multimedia (2005)
5. Kapoor, A., Burleson, W., Picard, R.W.: Automatic prediction of frustration. Int. J. Hum.-Comput. Stud. 65(8), 724–736 (2007)
6. Hoque, M.E., McDuff, D.J., Picard, R.W.: Exploring temporal patterns in classifying frustrated and delighted smiles. Trans. Affect. Comput. 65(8), 323–334 (2012)
7. Grafsgaard, J.F., Wiggins, J.B., Boyer, K.E., Wiebe, E.N., Lester, J.C.: Automatically recognizing facial indicators of frustration: a learning-centric analysis. In: Affective Computing and Intelligent Interaction (2013)
8. Bosch, N., D’Mello, S., Baker, R., Ocumpaugh, J., Shute, V., Ventura, M., Zhao, W.: Automatic detection of learning centered affective states in the wild. In: International Conference on Intelligent User Interfaces (2015)
9. Arroyo, I., Cooper, D.G., Burleson, W., Woolf, B.P., Muldner, K., Christopherson, R.: Emotion sensors go to school. In: Artificial Intelligence in Education (AIED), vol. 200, pp. 17–24 (2009)
10. Liu, Z., Pataranutaporn, V., Ocumpaugh, J., Baker, R.S.: Sequences of frustration and confusion, and learning. In: Proceedings of the 6th International Conference on Educational Data Mining, pp. 114–120 (2013)
11. Russell, J.A.: A circumplex model of affect. J. Pers. Soc. Psychol. 39(6), 1161 (1980)
12. Okur, E., Alyuz, N., Aslan, S., Genc, U., Tanriover, C., Arslan Esme, A.: Behavioral engagement detection of students in the wild. In: André, E., Baker, R., Hu, X., Rodrigo, M., du Boulay, B. (eds.) AIED 2017. LNCS (LNAI), vol. 10331, pp. 250–261. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61425-0_21
13. Aslan, S., Mete, S.E., Okur, E., Oktay, E., Alyuz, N., Genc, U., Stanhill, D., Arslan Esme, A.: Human expert labeling process (HELP): towards a reliable higher-order user state labeling process and tool to assess student engagement. Educ. Technol. 57(1), 53–59 (2017)
14. Aslan, S., Okur, E., Alyuz, N., Arslan Esme, A., Baker, R.S.: Human expert labeling process: valence-arousal labeling for students’ affective states. In: Proceedings of the 8th International Conference in Methodologies and Intelligent Systems for Technology Enhanced Learning. Springer, Cham (2018)

Author information

Authors and Affiliations

Intel Corporation, Hillsboro, OR, 97124, USA
Sinem Aslan, Eda Okur, Nese Alyuz & Asli Arslan Esme
University of Pennsylvania, Philadelphia, PA, 19104, USA
Ryan S. Baker

Corresponding author

Correspondence to Sinem Aslan.

Editor information

Editors and Affiliations

University of Crete and Foundation for Research and Technology – Hellas (FORTH), Heraklion, Crete, Greece
Constantine Stephanidis

Cite this paper

Aslan, S., Okur, E., Alyuz, N., Arslan Esme, A., Baker, R.S. (2018). Towards Human Affect Modeling: A Comparative Analysis of Discrete Affect and Valence-Arousal Labeling. In: Stephanidis, C. (eds) HCI International 2018 – Posters' Extended Abstracts. HCI 2018. Communications in Computer and Information Science, vol 851. Springer, Cham. https://doi.org/10.1007/978-3-319-92279-9_50

DOI: https://doi.org/10.1007/978-3-319-92279-9_50
Published: 07 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92278-2
Online ISBN: 978-3-319-92279-9
eBook Packages: Computer Science, Computer Science (R0)
