
1 Introduction

Adaptive Instructional Systems (AIS) gather diagnostic information about learner characteristics and provide an individually tailored response. By adapting to temporally changing, individual learning abilities and learner needs, micro-adaptive instructional systems aim to dynamically support the individual learner in achieving his or her learning goals [1]. On-task measures of learner performance as well as the learner’s current mental state are essential for providing this tailored feedback during the learning process. Several approaches in AIS consider the motivational state of the user (e.g. [2,3,4]), while in the operational context adaptive systems are often based on workload assessments (e.g. [5, 6]). In contrast to these single-state analyses, Schwarz, Fuchs and Flemisch [7] proposed a multidimensional assessment of user state as a more holistic approach to user state analysis in adaptive system design. Building on this approach, Schwarz and Fuchs developed RASMUS (‘Real-Time Assessment of Multidimensional User State’), the diagnostic component of a dynamic adaptation framework [8, 9]. As the conceptual framework is generic, it can be applied to various operational and instructional settings. The proof-of-concept implementation focused on a naval air surveillance task, providing on-task information about three potentially critical user states: high workload, passive task-related fatigue, and incorrect attentional focus [8].

Diagnostic outcomes for these three user states were validated in a prior experimental study [10]. However, as RASMUS diagnostics are currently based on self-determined rules, we assume that modifying these rules might increase diagnostic accuracy. Hence, the aim of this paper is to investigate an approach for evaluating and optimizing the existing diagnostic rules of RASMUS using the data of the prior validation study. To determine optimized rules, receiver operating characteristic (ROC) curve analyses were performed. ROC graphs can be used to assess the diagnostic accuracy of a test and are commonly used in medical research [11]. Using the data set of the prior validation study, we performed ROC curve analyses to optimize the rules for the physiological parameters that serve as indicators of high mental workload. The diagnostic accuracy of the initial and the modified rules was then evaluated in a repetition of the validation study.

The next section summarizes the main aspects of the conceptual framework and the workload assessment within RASMUS. Section 3 details the results of the ROC curve analyses that were performed to define optimized rules. Subsequently, Sects. 4 and 5 describe the methods and results of the newly conducted validation experiment. The paper concludes with a discussion of the results (Sect. 6) and a summary of conclusions and lessons learned (Sect. 7).

2 Conceptual Framework and Workload Assessment

RASMUS is the diagnostic component of an adaptation framework that detects performance decrements of the user and analyzes which user states show critical outcomes that may have caused the performance decrement [8]. The adaptation management component of this framework, named ADAM, then selects an adaptation strategy that is most appropriate to mitigate the detected critical state, and thus, to restore the user’s effectiveness [9].

RASMUS diagnostics are based on a multidimensional view of user state. This approach considers up to six user state dimensions (mental workload, fatigue, attention, situation awareness, motivation, and emotional state) that have been found to have a great influence on human performance [7]. Currently, the proof-of-concept implementation provides assessments of high workload, passive task-related fatigue, and incorrect attentional focus, as these states can be considered particularly relevant for the chosen task domain of naval air surveillance.

As the scope of this paper is to evaluate and optimize the diagnostic rules for mental workload, the following sections detail the workload assessment in RASMUS and the validation method.

2.1 High Mental Workload

RASMUS combines five parameters to assess mental workload: number of tasks, number of mouse clicks, pupil diameter, heart rate variability (HRV), and respiration rate. A state of high mental workload is assessed if at least three of these parameters show outcomes that indicate high workload. The classification of parameter outcomes is based on self-determined rules derived from literature findings. The following paragraphs briefly describe the three physiological parameters chosen for the workload assessment, their assumed relation to mental workload, and the rules that were defined for critical outcomes.

Pupil diameter has been found to increase during high workload tasks (e.g. [12,13,14]). Coyne and Sibley [15] showed that pupil diameter can be reliably measured with eye tracking systems, which are non-invasive and easy to set up. HRV responds to mental stress or workload, with studies suggesting a decrease in HRV during mentally demanding tasks (e.g. [14, 16,17,18,19]). Lastly, there is empirical evidence that breathing rates increase during mentally demanding tasks (e.g. [16, 19, 20]). Both HRV and respiration rate can be measured with rather non-invasive devices, such as chest-worn wearable sensors.

The physiological parameters were recorded continuously and evaluated based on moving mean windows of 30 s that were compared to an individual baseline value. Baseline values were calculated for each participant and each parameter at the beginning of the experiment by taking the average of the data collected in a period of 120 s. During this baseline measurement, mental workload was kept low to moderate. In the initial rule set, RASMUS labels each physiological parameter as critically high or low if the current mean deviates by more than 1 standard deviation (SD) from the baseline mean. For pupil diameter and respiration rate high workload is indicated by positive deviations > 1 SD from the baseline mean, and for HRV it is indicated by negative deviations > 1 SD from the baseline mean.
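The rule-based classification described above can be sketched as follows. This is a minimal illustration under the stated assumptions (30 s moving-mean windows, 120 s baseline, 1 SD deviation rule, three-of-five vote), not the actual RASMUS implementation; all function names are ours, and sensor I/O and window bookkeeping are omitted.

```python
import numpy as np

def baseline_stats(samples):
    """Mean and SD of the 120 s baseline recording for one parameter."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(), samples.std(ddof=1)

def is_critical(window, base_mean, base_sd, direction, threshold_sd=1.0):
    """Compare the mean of a 30 s moving window to the individual baseline.

    direction=+1 flags positive deviations (pupil diameter, respiration
    rate); direction=-1 flags negative deviations (HRV).
    """
    deviation = (np.mean(window) - base_mean) / base_sd
    return direction * deviation > threshold_sd

def high_workload(parameter_flags):
    """High workload is assessed if at least three of the five parameter
    outcomes are critical."""
    return sum(parameter_flags) >= 3
```

With this structure, the rule modifications examined in Sect. 3 amount to changing `threshold_sd` per parameter while leaving the voting scheme untouched.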

2.2 Perceived Mental Workload

Perceived mental workload was used as a comparative measure for validating and optimizing the mental workload diagnosis of RASMUS. There are various questionnaires for assessing subjective mental workload. The National Aeronautics and Space Administration Task Load Index (NASA-TLX [21]) is one of the superior scales with respect to sensitivity and user acceptance [22]. In both the prior and the repeated experimental study, the workload rating was performed using the NASA-TLX subscale of mental effort. Ratings were obtained each time RASMUS detected a performance decrement (the scenario was paused at that time to ensure the rating did not affect the user’s task completion). The rating was performed on a 15-point scale proposed by Heller [23] that is divided into five subsections: very low (1–3), low (4–6), medium (7–9), high (10–12), very high (13–15).

2.3 Task

The generic diagnostic tool RASMUS has been integrated into a naval anti-air warfare (AAW) simulation [24]. In this simulation, operators completed four different simplified subtasks: identifying contacts, creating new contacts, warning contacts, and engaging contacts. Figure 1 shows the tactical display area (TDA) of the simulation. The blue dot in the center of the map represents the own ship. Identified radar contacts are visualized in green (neutral), blue (friendly), or red (hostile). New, unidentified contacts (yellow) have to be identified as neutral, friendly, or hostile according to certain criteria. If hostile contacts enter the blue or the red circle around the own ship (see Fig. 1), they have to be warned or engaged, respectively.

Fig. 1.

Screenshot of the tactical display area of the naval air-surveillance simulation (Color figure online)

The tasks occur at scripted times during the scenario. If tasks have to be performed simultaneously, users are instructed to process them in order of priority. Each task has to be finished within a specified time limit (cf. [10]). If the time limit is exceeded or the task is not completed correctly, RASMUS logs a performance decrement.

3 Definition of New Rules

ROC graphs or curves quantify the accuracy of a binary diagnostic test or classifier and are created by plotting sensitivity (true positive rate) against 1 − specificity (false positive rate). The measure commonly used in this context is the area under the ROC curve (AUC; cf. [11, 25]). Performing a ROC analysis requires information about the true state. However, user states such as mental workload are latent constructs that cannot be measured directly. For this reason, we used the NASA-TLX subjective mental effort rating (cf. Sect. 2.2) as an approximation of the true user state within the ROC curve analysis. Subjective rating outcomes were dichotomized in order to discriminate between critical and noncritical mental workload states. The cut-off value was set based on the subsections of the questionnaire (cf. Sect. 2.2): any rating above 9 on the 15-point scale was considered a high (critical) workload state, whereas a rating of 9 or smaller was not.
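The dichotomization and the AUC computation can be illustrated with a short sketch. The data and function names are hypothetical; the AUC is computed via its rank interpretation, which is mathematically equivalent to the area under the ROC curve.

```python
import numpy as np

def dichotomize(ratings, cutoff=9):
    """Ratings above 9 on the 15-point scale count as critical workload."""
    return np.asarray(ratings) > cutoff

def auc_from_scores(scores, truth):
    """AUC via its rank interpretation: the probability that a randomly
    chosen critical case scores higher than a noncritical one (ties
    counted as 0.5)."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    pos, neg = scores[truth], scores[~truth]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```

An AUC of .5 corresponds to guessing, and 1.0 to perfect discrimination, which is the scale against which the rule sets are compared below.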

First, ROC curves were calculated using the threshold value initially set for each parameter for discriminating between critical and noncritical outcomes with respect to mental workload. We then systematically varied the threshold values in order to determine the value that maximizes the AUC.
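This threshold sweep can be sketched as follows, with invented example data. Note that for a single binary rule the ROC "curve" reduces to one operating point, so its AUC equals the mean of sensitivity and specificity.

```python
import numpy as np

# Hypothetical per-event data: deviation of pupil diameter from the
# individual baseline (in SD units) and the dichotomized subjective rating.
deviations = np.array([0.2, 0.9, 0.4, 1.4, 0.6, 1.8, 0.3, 1.1])
critical = np.array([False, True, False, True, False, True, False, True])

def binary_auc(threshold):
    """AUC of the binary rule 'deviation > threshold'; for a single-point
    ROC this equals (sensitivity + specificity) / 2."""
    flagged = deviations > threshold
    sensitivity = (flagged & critical).sum() / critical.sum()
    specificity = (~flagged & ~critical).sum() / (~critical).sum()
    return (sensitivity + specificity) / 2

# Sweep candidate thresholds (in SD units) and keep the AUC-maximizing one.
candidates = np.arange(0.25, 2.25, 0.25)
best = max(candidates, key=binary_auc)
```

With this toy data the sweep selects a threshold of 0.75 SD; in the study, the analogous procedure yielded the modified thresholds reported below.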

The ROC curve analysis resulted in modified rules for pupil diameter (>.5 SD instead of >1 SD positive deviation from baseline) as well as for HRV (>2 SD instead of >1 SD negative deviation from baseline). For respiration rate, the analysis did not reveal any improvement from changing the rule; therefore, the existing rule (>1 SD positive deviation from baseline) was not modified. Table 1 summarizes the resulting AUC, sensitivity, and specificity for the initial as well as the modified rules for each parameter. AUC values range between .6 and .7 for the modified rules, which can be considered a sufficient outcome [25].

Table 1. Comparison of initial and modified rules for the physiological parameters after performing individual ROC curve analyses

As a next step, we analyzed to what extent this modification of the rules for the physiological parameters affects the accuracy of the overall workload assessment in RASMUS. Figure 2 shows the mean deviation (MD) of the subjective rating from the baseline for critical and noncritical system diagnoses with respect to the modified as well as the initial rule set. For the initial rule set, subjects rated their perceived workload significantly higher when the system diagnosis was critical than when it was noncritical (t(74) = 3.301; p < .01). The same outcome was observed for the modified rule set (t(74) = 3.882; p < .001). However, the results suggest a slightly better distinction between critical and noncritical system diagnoses for the modified rule set based on the subjective rating.

Fig. 2.

Mean perceived workload ratings (with SE as error bars) for critical and noncritical system diagnoses by RASMUS for the initial set of rules (a) and the modified set of rules (b) applied to the data set of the prior validation study [10]

The overall ROC curve for the diagnosis of high mental workload (see Fig. 3) also indicates a slightly higher AUC for the modified set of rules (AUCmodified = .780; p < .001) than for the initial set of rules (AUCinitial = .730; p < .01). Exceeding a value of .7, both diagnostic rule sets can be considered good diagnostic tests [25].

Fig. 3.

Initial and modified rule sets applied to the data set of the prior validation experiment [10] on which the optimization was based.

4 Repetition of Validation Study

A repetition of the initial validation study was conducted to investigate whether the outcomes obtained from the ROC curve analysis can be replicated, and thus whether they are temporally stable.

4.1 Methodological Design

Fifteen subjects (8 male, 7 female) aged between 20 and 51 years (M = 31.26 ± 8.27) participated in the experiment. A multisensory chest strap (Zephyr BioHarness3) was used to collect data on HRV and respiration rate. Pupil diameter was recorded with an eye tracker (Tobii X3-120) placed underneath the monitor. The setup is depicted in Fig. 4.

Fig. 4.

Experimental setup. Multisensory chest strap (front left), eye tracking device attached to the monitor underneath the screen

After reading the instructions, participants completed a ten-minute training scenario, during which the examiner explained the task completion for every subtask (cf. Sect. 2.3). Subsequently, participants performed the tasks in an experimental test scenario with a net duration of 45 min. The scenario was divided into three successive phases, merging into each other without breaks (see Fig. 5). The scenario paused whenever a performance decrement was detected; users then rated their current perceived mental workload. Thus, the actual duration of the experiment depended on the user’s performance. Perceived mental workload was also recorded at the end of the training phase as well as at the end of the experiment in order to obtain an individual baseline of the subjective rating.

Fig. 5.

Sequence of the different phases and their durations (cf. [10])

4.2 Hypotheses

Two hypotheses were tested in this experiment (see below). The first hypothesis addresses the question whether the outcomes of the first validation experiment can be replicated. With the second hypothesis we aim to assess whether the modified rule set shows a higher diagnostic accuracy than the initial rule set.

  • H1: Perceived mental workload is rated higher for performance decrements with critical system diagnoses than for noncritical system diagnoses
    (a) using the initial rule set,
    (b) using the modified rule set.

  • H2: In comparison to the initial rule set, the diagnostic accuracy is increased by the modified rule set.

4.3 Data Analysis

The psychophysiological and behavioral data were logged to text and CSV files for each participant. Data preparation included allocating the subjective ratings to the corresponding diagnostic outcomes of RASMUS. Hypothesis 1 was tested by comparing the mean deviation of the subjective rating from baseline for high and non-high workload outcomes of RASMUS, using the initial rule set to test H1a and the modified rule set to test H1b. Concerning Hypothesis 2, diagnostic accuracy was assessed for the modified and initial rule sets by performing ROC curve analyses with the dichotomized subjective rating as the “true” user state. The data analysis was conducted with SPSS (version 25.0).

5 Results

5.1 Descriptive Analysis

A total of 79 performance decrements occurred across all subjects. As expected, most of the performance decrements occurred in the high workload phase (see Table 2). During the monotony phase, only 12 performance decrements were observed. Two performance decrements were recorded during the second half of the baseline phase. The number of performance decrements in each phase is very similar or identical to the first validation experiment [10] (see numbers in brackets in Table 2). However, only slightly more than 25% of the subjects showed performance decrements during the monotony phase in the second experiment, whereas almost 60% of the subjects were affected in the preceding experiment.

Table 2. Number of performance drops and subjects affected per phase. Numbers in brackets refer to the first validation experiment [10].

5.2 Hypothesis Testing

A non-parametric Mann-Whitney U test was conducted to test H1 due to the violation of the assumption of normality for parts of the data set. Figures 6a and 6b show the subjective ratings for critical and noncritical system diagnoses for the initial and the modified set of rules, respectively. The analysis confirmed that perceived workload was rated significantly higher by the subjects for critical states of workload than for noncritical states of workload diagnosed by RASMUS using the initial set of rules (z = −2.64; p < .01). However, subjective ratings differed less between critical and noncritical states of workload when using the modified set of rules (see Fig. 6b). The statistical analysis revealed the difference to be nonsignificant (z = −1.3; ns). Therefore, H1 can be confirmed for the initial rule set (H1a) but not for the modified one (H1b).
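A Mann-Whitney U test of this kind can be reproduced with SciPy; the study itself used SPSS, and the rating values below are invented for illustration only.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical baseline deviations of the perceived-workload rating,
# grouped by the RASMUS diagnosis at each performance decrement.
critical_ratings = np.array([4.0, 3.5, 5.0, 2.5, 4.5, 3.0])
noncritical_ratings = np.array([1.0, 2.0, 0.5, 1.5, 2.5, 1.0])

# One-sided alternative: ratings are expected to be higher when RASMUS
# diagnosed a critical workload state.
stat, p = mannwhitneyu(critical_ratings, noncritical_ratings,
                       alternative="greater")
```

The one-sided `alternative="greater"` matches the directional form of H1; a two-sided test would be the more conservative choice if no direction were predicted.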

Fig. 6.

Mean perceived workload ratings (SE) for critical and noncritical system diagnoses by RASMUS for the initial (a) and the modified set of rules (b) applied to the new data set

With respect to H2, we evaluated whether the modified set of rules led to an improved accuracy of the workload diagnosis. In the first experiment, the overall ROC curve indicated a better discrimination between critical and noncritical states for the modified rule set compared to the initial rule set (see Fig. 3). Figure 7 shows the resulting ROC curves when applying both rule sets to the data set of the second experiment.

Fig. 7.

ROC curves for initial and modified set of rules applied to the new data set

The analysis showed that the modified rule set was less accurate than the initial rule set when applied to the new data set. The diagnostic accuracy of the initial rule set significantly differs from .5 at an AUC of .645 (p < .05; sensitivity = .643; specificity = .647) whereas for the modified rule set it does not (AUC = .588; p = .198; sensitivity = .607; specificity = .569). Consequently, the hypothesis that the diagnostic accuracy is higher with the modified rule set (H2) cannot be accepted.

6 Discussion

The results of the second validation study confirmed the temporal stability of the diagnostic outcomes when using the initial rule set. Surprisingly, the initial rules also showed a better overall diagnostic performance than the modified rules determined by the ROC curve analysis. Hence, the outcomes of the second study indicate that the initial rule set is likely to achieve a more consistent distinction between critical and noncritical subjective workload states than the modified rule set. The results imply that the modified diagnostic rules were overfitted to the data set the optimization was based on, and are thus not transferable to a different data set.

It should be noted that, as part of a post hoc analysis not detailed in this paper, we also performed ROC curve analyses for each individual physiological parameter. The results likewise indicate that the initial rules for the physiological parameters provide better diagnostic accuracy than the individual modified rules when applied to the new data set. However, a surprising result was found for heart rate variability: the ROC curve analysis revealed that the AUC was below .5 for both the modified and the initial rules. This means the diagnosis for this data set is less accurate than guessing (e.g. [25]) and suggests that HRV behaves in the opposite way to what literature findings indicate. This could have various causes, e.g. sensor-related measurement errors or inadequate sensor placement. However, further post hoc analyses revealed that, in line with expectations, HRV correlates negatively with the subjective (non-dichotomized) workload rating, even though the correlation is rather weak (r = −.27). Hence, the unexpected AUC outcome may result from the dichotomization of the subjective rating (critical states: ratings > 9) that was necessary for performing the ROC curve analyses.

This contradictory finding on HRV illustrates a general challenge of validating and optimizing rules for user state classification. In contrast to, e.g., medical diagnoses, it is hard to obtain an appropriate reference measure that reliably differentiates between true and false critical user states. In our analysis, we used the subjective rating as an estimation of true workload. However, the subjective workload rating is also error-prone, e.g. affected by response bias, and it has to be artificially dichotomized for performing ROC curve analyses. This means that the cut-off value chosen for discriminating between true and false high workload states also impacts the analysis outcomes.

Nevertheless, the diagnostic accuracy of the initial rule set could be confirmed by the second validation study, indicating that RASMUS can reliably differentiate between potentially high and non-high workload states. The fact that HRV was not found to be a reliable indicator of workload in the second study emphasizes the necessity to combine several indicators in order to provide a more robust diagnostic result.

7 Conclusion and Lessons Learned

ROC curve analysis is a common method for evaluating diagnostic tests in medical research. In this paper, we investigated whether this approach can be used to evaluate and optimize diagnostic rules for physiological user state assessment in adaptive systems. Considering the results of our study, we suggest that ROC curve analysis may be useful for evaluating and comparing the diagnostic accuracy of different workload indicators. However, the results of this study could not show that this method is appropriate for defining and optimizing rules for single user state indicators. As these outcomes also depend on the validity of the subjective rating and on the dichotomization used to obtain a “true” high workload state, future studies could examine whether cut-off values other than > 9 for the subjective rating are more appropriate for classifying a high workload state. Other methods for optimizing and validating the user state indicators could also be investigated.

Another option for optimizing diagnostic outcomes is to apply methods of artificial intelligence, such as artificial neural networks, as proposed e.g. by Wilson and Russell [26]. However, such systems are often considered “black boxes”, as the algorithm that produces the diagnostic outcomes is often too complex to be understood [27]. The rule-based approach has the advantage of providing more transparency.

Considering RASMUS’ application within an adaptation framework, the results indicate that the current workload assessments of RASMUS are sufficiently accurate and reliable to support a proper selection and configuration of adaptation strategies for this task domain. Nevertheless, we identified further options for improving RASMUS diagnostics: two of the five parameters currently used for the workload assessment (heart rate variability and respiration rate) are retrieved from the same sensor (BioHarness). Hence, whenever there is a problem with this sensor, both indicators become unreliable, and the robustness of the diagnostic outcomes decreases.

Adding one or more independent parameters to the diagnosis could mitigate this problem. Hernandez et al. [28], for example, investigated the possibility of using a pressure-sensitive keyboard and a capacitive mouse for stress detection. They found increased typing pressure in more than 79% of the participants, as well as increased surface contact with the mouse in 75% of the participants, during stressful tasks [28]. Another possible measure for workload and stress detection could be the inclination of the trunk (e.g. [29]), which can actually be retrieved from the BioHarness sensor; alternatively, a separate pressure-sensing mat placed on the seat of the operator could be used.

One last note: The diagnostic framework of RASMUS has been applied to a naval air surveillance task. Hence, the indicators currently used for user state assessment in RASMUS were specifically selected for this task. In the context of AIS these assessments might also prove useful for determining mental states of the learner in order to provide adequate feedback and support. However, this has to be investigated in more detail in future experimental studies.