1 Introduction

According to current statistics, the number of airline passengers will continue to grow over the coming years, with expected annual growth rates of up to five percent [3, 23]. To meet the resulting demand and ensure smooth and safe travel, the work of air traffic controllers (ATCs) is of high importance. However, the tasks of ATCs are complex and demanding [31]. In particular, factors that raise the general workload, such as traffic complexity or frequency congestion, also increase the mental workload and might therefore influence the error rate of an air traffic controller [6, 20].

In general, performance in highly demanding tasks and decision-making processes depends on several influences [22, 29, 32, 33, 40]. Individual differences can be assessed by personality traits [14]. In particular, the traits neuroticism and conscientiousness seem to be important when attempting to simulate the rate of cognitive processes. Higher neuroticism usually leads to lower performance, because the cognitive load imposed by neuroticism (e.g. worry) reduces the processing capacity of the working memory system. Higher conscientiousness, on the other hand, is associated with better overall training performance [40].

Task performance also depends on task difficulty and time pressure, which influence the workload and thereby affect problem-solving performance [13, 44].

However, time pressure does not simply increase the cognitive workload; it also influences emotional valence, resulting in a measurable increase in arousal and fewer control options over tasks compared to scenarios with less challenging time constraints [9, 42]. Such task-induced emotions could limit the cognitive resources needed to solve highly demanding tasks [12]. This might in turn increase the mental workload, which once again impairs task performance [32]. Emotional influences, however, do not always have negative effects. Investigating how emotions influence different executive functions, Mitchell and Phillips [29] found that positive mood decreases performance in working memory tasks and simple problem-solving tasks (Tower of London), whereas it increases performance in tasks assessing fluency and creativity. Findings on the influence of negative mood on performance are contradictory as well: Mitchell and Phillips [29] found no significant influence of negative mood on performance in the tasks they investigated. Jeon et al. [25] found in a driving scenario that negative mood impairs task performance, Berggren et al. [2] and Jeon and Coshere [24] found evidence that emotional stimuli can slow down executive functions, and van Dillen and Koole [11] found that, unlike positive mood, which lasts longer, negative mood can be reduced by the distraction of solving complex and demanding tasks.

2 Emotion and Workload Within Air Controller Tasks

We present research based on a previously developed and validated emotion model [30]. Within our new project “StayCentered – Methodenbasis eines Assistenzsystems für Centerlotsen” (methodical base for an air traffic controller assistance system - MaCeLot) we improve and validate the aforementioned model to assess emotional and cognitive influences on the performance of air traffic controllers.

In general, air traffic controllers work within designated airspace areas, where they have to maintain the flow of air traffic by contacting aircraft pilots and providing them with advice, instructions and information about weather conditions and safe flight, ascent and descent paths. Under normal conditions, any given airspace sector is monitored by a dyad of air traffic controllers. While they are located at a shared workspace, they assume different roles: the executive is responsible for coordinating the flights and communicating with the pilots, while the planner manages the acceptance or handover of flights from or to other sectors (for details: Pfeiffer et al. [34]). The work of air traffic controllers is considered to be very demanding, since it is characterized by high responsibility and complex decision making under time pressure, which can result in a negative stress response [21, 35]. Short-term consequences of negative stress such as anxiety, despondence, anger, and cognitive impairments (e.g. Khansari et al. [27]) can directly affect the emotional and cognitive state of the air traffic controllers and therefore hamper their performance [39]. While the existing literature covers several factors that influence the performance of aviation-related workers, such as fatigue [8], situation awareness [45] and mental imagery [38], as well as assessment techniques to measure workload (e.g. Vewey and Veltman [43]), it remains rather unclear how emotional conditions, for instance as a result of stress, influence the air traffic controllers' performance.

Therefore, we examined the influences of emotion and workload on the performance in an air traffic scenario.

3 Pretest Experiment

We collected data from 7 volunteers at the University of Technology Chemnitz (57.1% male, 42.9% female, \(M_{Age}\) = 27.29; SD = 2.75). The majority of participants (85.7%) had no prior experience with air traffic control tasks (including, but not limited to, video games).

3.1 Experimental Design

We conducted this pretest as a between-subjects design, in which each participant completed a simulated air traffic controller task. The simulation was divided into a practice session and two conditions of 4 min, a neutral condition (NC) and an emotional condition (EC). Before the emotional condition, each participant watched either a positive or a negative film clip to induce the respective mood.

The aim of this pretest was to check whether the methods used are able to detect the influences of workload and mood on performance in the air controller task.

3.2 Methods

Since the literature reports mixed results on the influence of emotions on performance depending on the measurements used, we decided to use both subjective and objective measurements. We used these measurements to verify whether each participant's subjective perception of their emotional and cognitive state is consistent with the objective measurements.

Measurements of Mood. Changes in mood were assessed with the questionnaire “Aktuelle Stimmungsskala” (ASTS). This questionnaire [10] is a shortened version of the “Profile of Mood States” scale (POMS) by McNair et al. [28]. The ASTS consists of 19 German adjectives that are combined into five scales, representing anger, sadness, hopelessness, positive mood and tiredness. Subjects estimate their current feeling by rating on a scale from 7 (very strong) to 1 (not at all) how well each adjective represents their feeling. The questionnaire has been validated and is considered to have good to very good reliability.

Furthermore, we recorded the skin conductance during the whole experiment to measure arousal. These data serve as an additional indicator for the mood induction, to check how strongly participants are affected by the videos and to identify how long an induced mood lasts and influences subjects during the task.

Measurements of Personality. The personality traits neuroticism and conscientiousness of each participant were assessed with the NEO-FFI questionnaire [5].

Measurements of Workload. Changes in the workload were assessed with the NASA-TLX based on the weighted mean of 6 sub-scales: mental demand, physical demand, temporal demand, effort, performance and frustration level [17]. This questionnaire has been used and revised in air traffic control for more than two decades [16].
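The weighted mean mentioned above can be sketched as follows. This is a minimal illustration that assumes the standard NASA-TLX procedure (one rating from 0 to 100 per sub-scale, and weights derived from 15 pairwise comparisons so that they sum to 15). The function name and the example values are hypothetical and not taken from the study.

```python
# Sketch of a weighted NASA-TLX score (assumed standard procedure, see lead-in).
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")

def nasa_tlx_weighted(ratings, weights):
    """Return the weighted mean of the six sub-scale ratings.

    ratings: dict sub-scale -> rating on a 0..100 scale
    weights: dict sub-scale -> number of times the sub-scale was chosen
             in the 15 pairwise comparisons (assumed to sum to 15)
    """
    total_weight = sum(weights[s] for s in SUBSCALES)
    if total_weight != 15:
        raise ValueError("weights from 15 pairwise comparisons must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / total_weight

# Hypothetical example values for one participant.
example_ratings = {"mental": 80, "physical": 10, "temporal": 70,
                   "performance": 40, "effort": 60, "frustration": 50}
example_weights = {"mental": 5, "physical": 0, "temporal": 4,
                   "performance": 1, "effort": 3, "frustration": 2}
example_score = nasa_tlx_weighted(example_ratings, example_weights)
```

A sub-scale with weight 0 (here physical demand) does not contribute to the overall score, which is why the weighted variant can differ noticeably from a plain average of the six ratings.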

As an objective measurement, we recorded the pupillary response, since [1] reported that different task difficulties and workloads lead to different pupil dilation. With this measurement, we check whether the reported workload increases or decreases consistently with the workload indicated by the recorded pupil dilation.

Measurements of Performance. During the experiment, the simulation counted and logged whether airplanes left an airspace unharmed or whether a collision happened. Each airplane that left the airspace unharmed counted +5 points, each airplane that collided in the responsible area counted −3 points, and each airplane that collided in the outer airspace area counted −2 points. The overall score was calculated and displayed in real time during the neutral and emotional conditions.
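As a small illustration of the scoring rule described above, the following sketch sums the point values over a list of logged events. The event labels and the function name are hypothetical; only the point values (+5, −3, −2) come from the text.

```python
# Point values as stated in the text; the event labels are hypothetical names.
POINTS = {
    "left_unharmed": 5,           # airplane left the airspace unharmed
    "collision_responsible": -3,  # collision inside the responsible airspace
    "collision_outer": -2,        # collision in the outer airspace area
}

def total_score(events):
    """Sum the points for a sequence of logged simulation events."""
    return sum(POINTS[event] for event in events)

# Hypothetical log: four safe exits, one collision in each area.
example_log = ["left_unharmed"] * 4 + ["collision_responsible", "collision_outer"]
example_total = total_score(example_log)  # 4 * 5 - 3 - 2 = 15
```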

Fig. 1. The presentation of the stimulus represents the visual style of the radar screen. (Color figure online)

3.3 Material

Simulation of the air controller task. We designed a radar screen with different airspaces using Unity3D (see Fig. 1). These airspaces were displayed on three screens in front of the participants. During the whole experiment, every 4 s a pair of airplanes appeared from the top, bottom, left or right, heading towards the opposite side of the screen via a randomly chosen collision point. A green-framed rectangle on the middle screen marked the so-called responsible airspace. If a collision happened, the program played a collision sound and deducted points from the score. If an airplane left the screen successfully, the software counted this event as positive points. The simulation stopped automatically when the scheduled maximum experiment time was reached.

Mood Induction. Participants were randomly assigned to a happy or sad mood condition. We used two video clips with a duration of 4 min to induce the assigned mood (happy or sad). This induction methodology is described by [36] and has been used and revised in several studies [4, 15, 18]. Participants in the sad condition watched a film clip from “The Lion King” and participants in the happy condition watched a film clip from “When Harry Met Sally”. The videos were played on one of the three screens on which the simulation of the air controller task was subsequently displayed.

We also conducted a small pretest of the video material, in which participants watched two videos for each condition (sad and happy) at home. Afterwards, they filled out a questionnaire, rating each video on a scale from 5 (strong) to 1 (not at all) according to how well it represented the intended mood. The videos we used for the negative mood were “The Lion King” and “The Champ” (English), while the videos for the positive mood were “When Harry Met Sally” and “Zoomania” (Trailer). After evaluating the questionnaires, we decided to use “The Lion King” and “When Harry Met Sally” for our mood induction, as they were rated the highest and were available in the native language (German) of most of the expected participants.

Empatica-bracelet. The Empatica E4 wristband, with a relatively small weight of only 40 grams, combines sensors for galvanic skin response, heart rate and skin temperature with an accelerometer. It is therefore a valuable tool for unobtrusively measuring biophysiological responses without wires. The skin conductance is measured by two electrodes on the inside of the wristband. It is measured in \(\mu S\) at 4 Hz, with a resolution of about 900 picosiemens and a range of 0.01 to 100 \(\mu S\). The heart rate is measured by four photodiodes via photoplethysmography, which is based on volume changes of the arterial blood flow in the outer wrist (blood volume pulse, BVP). The sampling rate is 64 Hz, and besides the BVP the wristband also records the heart-rate variability.

Eye tracker. To record gaze patterns and pupillary responses, we used an SMI ETG2 mobile eye tracker, which records eye movements and changes in pupil dilation at 60 Hz. The whole eye-tracking-glasses setup weighs about 47 grams and is about the size of everyday protective glasses. The glasses have a tracking accuracy of \(0.5^{\circ }\) and track the gaze of a user over \(80^{\circ }\) horizontally and \(60^{\circ }\) vertically. The coax camera is installed inside the frame of the glasses and records the field of view at 960\(\,\times \,\)720 p with 30 frames per second.

3.4 Procedure

The study was conducted in a prepared experiment room at the University of Technology Chemnitz. After the participants completed the consent forms, they completed the NEO-FFI, the first ASTS and a demographic questionnaire (see Fig. 2). Afterwards, the preparation and calibration phase started including the setup of the Empatica-bracelet and the eye tracker glasses.

Fig. 2. Overview of the procedure for the pretest.

In the following practice session, each participant received instructions and an easy task to learn how to manage the airplanes on the screens. This session ended once the subject was able to manage the scenario (10 consecutive correct answers), after which the participant completed another ASTS and the first NASA-TLX questionnaire. After the practice session, the participant entered a so-called neutral session, in which the participant had to manage all airplanes on the screen for 7 min.

In this session, subjects controlled the airplanes on the radar by voice (see Fig. 3) in a Wizard-of-Oz setup. Each participant was the responsible air traffic controller of the middle airspace (green rectangle) and had to manage all airplanes, which appeared randomly in the outer airspaces. The airplanes appeared in pairs at a given frequency, each with a given number and a random height, from both sides or from the top and bottom, heading to the same randomized point in the responsible airspace. During the whole experiment, participants had to keep in mind that, since we displayed a 3D scene on a 2D screen, airplanes that do not collide in 3D could appear at the same location on the screen.

Fig. 3. How participants change the height of an airplane. (Color figure online)

The control commands used to maintain the air traffic flow and to avoid collisions were similar to real air controller commands (see Fig. 3). Subjects therefore had to include in their commands the number of the chosen airplane plus the information about what they wanted to change, for example the heading or height of the airplane. The experiment leader played the role of the pilot, controlling the airplanes in the background.

The neutral session was concluded by completing the third ASTS and the second NASA-TLX questionnaire. Afterwards, each participant watched either a positive or a negative video and subsequently performed the last trial session. Since we induced a mood before this session, it is called the emotional session. Within this trial, subjects had to manage the airplanes for 7 min as they did in the neutral session. Finally, the experiment was concluded by filling out the last ASTS and NASA-TLX.

3.5 Data Preparation

After conducting the experiment, the pupil diameter data recorded by the eye tracker (see material section) had to be cleaned: artifacts, blinks and other undesired patterns in the data stream [1] were filtered out. For this, we used MATLAB functions to implement standard methods for cleaning and analyzing pupil diameter data. First, we deleted all blinks in the signal, which are characterized by zero values in the data stream. Then, we interpolated the missing values, used a MATLAB function that detects and deletes outliers (values outside the 25th and 75th percentile of the range of pupil diameter in the whole experiment), and applied a median filter to smooth the signal. Participants with more than 18% blinks or zeros in the data stream were excluded from the statistical analysis, as very noisy signals could distort the filtering functions and the evaluation.
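The cleaning steps described above can be sketched roughly as follows. This is not the MATLAB code used in the study but a plain-Python illustration under several assumptions: all function names are hypothetical, the median-filter window of 5 samples is an arbitrary choice, gaps left by removed outliers are interpolated a second time, and the outlier bounds are written as quartiles widened by `iqr_factor` times the interquartile range (1.5 is Tukey's convention and an assumption here; `iqr_factor=0` corresponds to a literal reading of the 25th/75th percentile rule).

```python
from statistics import median, quantiles

def interpolate_gaps(samples):
    """Linearly interpolate None entries; edge gaps take the nearest valid value."""
    valid = [i for i, v in enumerate(samples) if v is not None]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    out = list(samples)
    for i in range(valid[0]):                      # leading gap
        out[i] = samples[valid[0]]
    for i in range(valid[-1] + 1, len(out)):       # trailing gap
        out[i] = samples[valid[-1]]
    for a, b in zip(valid, valid[1:]):             # gaps between valid samples
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = samples[a] + t * (samples[b] - samples[a])
    return out

def median_filter(samples, window=5):
    """Sliding median; the window shrinks at the signal edges."""
    half = window // 2
    return [median(samples[max(0, i - half): i + half + 1])
            for i in range(len(samples))]

def clean_pupil(raw, max_blink_fraction=0.18, iqr_factor=1.5, window=5):
    """Return the cleaned signal, or None if the participant is excluded."""
    blink_fraction = sum(1 for v in raw if v == 0) / len(raw)
    if blink_fraction > max_blink_fraction:
        return None                                # too many blinks/zeros
    # Step 1: treat zeros as blinks and interpolate over them.
    no_blinks = interpolate_gaps([None if v == 0 else v for v in raw])
    # Step 2: drop values outside the (widened) quartile bounds.
    q1, _, q3 = quantiles(no_blinks, n=4)
    low = q1 - iqr_factor * (q3 - q1)
    high = q3 + iqr_factor * (q3 - q1)
    no_outliers = interpolate_gaps(
        [v if low <= v <= high else None for v in no_blinks])
    # Step 3: smooth with a median filter.
    return median_filter(no_outliers, window)

# Hypothetical pupil diameters: one blink (0) and one spike (9.0).
example_raw = [4.0, 4.1, 0, 4.2, 4.1, 9.0, 4.0, 4.1, 4.2, 4.0]
example_clean = clean_pupil(example_raw)
```

Checking the blink fraction before any interpolation keeps the exclusion criterion independent of the filter settings, so changing the window or the outlier bounds does not change which participants are excluded.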

The EDA data of the Empatica-bracelet were also cleaned by removing artifacts and outliers with the MATLAB functions. Furthermore, we excluded participants with small values (mean of all conditions < 0.3) and low variance in their data, because their skin conductance measurements appear not to be informative about their arousal levels.
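The exclusion rule for the EDA data can be illustrated with the following sketch. The mean threshold of 0.3 is taken from the text; everything else is an assumption: the variance threshold is a hypothetical placeholder because the text gives no value, the two criteria are combined with a logical AND, the mean is computed over the per-condition means, and the function and condition names are invented for illustration.

```python
from statistics import mean, pvariance

def exclude_eda(condition_traces, mean_threshold=0.3, variance_threshold=0.01):
    """Return True if a participant's EDA data should be excluded.

    condition_traces: dict condition name -> list of skin conductance samples
    variance_threshold: hypothetical value, not specified in the text
    """
    condition_means = [mean(trace) for trace in condition_traces.values()]
    pooled = [v for trace in condition_traces.values() for v in trace]
    low_level = mean(condition_means) < mean_threshold
    low_variance = pvariance(pooled) < variance_threshold
    return low_level and low_variance

# Hypothetical participants: one flat, low signal and one responsive signal.
flat = {"video": [0.10, 0.11, 0.10], "emotional": [0.12, 0.10, 0.11]}
responsive = {"video": [0.5, 1.2, 2.0], "emotional": [1.0, 1.8, 0.9]}
```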

4 Preliminary Results and Discussion of the Used Methods

Experiment Design. For maximum external validity, the cognitive and emotional states should ideally be measured under realistic conditions. However, due to legal constraints and security issues, it is impossible to conduct experiments with German air traffic controllers while they are at work. Therefore, we needed to find an adequate scenario to gather data from participants while they perform typical air traffic controller tasks. For this purpose, we decided to create a subtask concerning the prevention of aircraft collisions, since the full range of air controller tasks is much too complex for laypersons to handle and is also influenced by too many variables to be the subject of a controlled experiment (for an overview of the cognitive complexity in air traffic control: [19]). The main sources of information about the ongoing traffic in the sector are the flight strips and the radar screen. Since understanding and properly using the flight strips requires special knowledge, we used a simulated radar screen as the device for stimulus presentation. Prior research suggests a possible confounding influence of the visual presentation on the mental workload in comparable tasks [26] due to different levels of perceptual load associated with the visual stimuli [7]. To keep our task comparable to the air traffic controllers' work, we re-created a radar screen using the same color scheme as well as highly similar icons and information texts (see Fig. 1). Although realistic scenarios involve multiple airplanes at the same time with differences in heading, speed and flight level, this would be overly complex for participants without expertise. Therefore, we use only two airplanes, which move faster than the ones on the real radar screen to compensate for the lower demands due to the reduced number of airplanes. The characteristics of the experimental task remain the same as for the air traffic controllers: participants have to perceive visual stimuli, retrieve relevant information and perform an adequate input, if necessary.

ASTS. The analysis of the ASTS questionnaires has not yet shown a significant correlation with the score. However, we also observed large individual differences within the small sample. Participants reported no problems with this questionnaire. Thus, we will use it for the upcoming experiment.

Fig. 4. Recorded pupil diameter during video and emotional session. Green lines represent participants with induced positive mood and blue lines participants with induced negative mood. (Color figure online)

NASA-TLX. The NASA-TLX questionnaire raised several questions and caused confusion among participants about how to fill it out. In general, subjects said that the questionnaire is unintuitive and that its description is difficult to understand. Thus, we decided to replace the NASA-TLX with the Instantaneous Self-Assessment of workload (ISA) in questionnaire form [41]. This technique is also used in air traffic control, where controllers report their current workload to their supervisors, as well as in simulations.

Pupil diameter. We analyzed the change in pupil diameter during the video session and during the emotional session. Fig. 4 shows the recorded curves of all participants. In the emotional session, the pupil diameter of participants with induced positive mood fluctuates more than that of participants with induced negative mood. Moreover, the pupil diameter increases during the air traffic task (see Fig. 4(b)), whereas during the video session the pupil diameter, and thus the workload, decreases (see Fig. 4(a)). Thus, we assume that the recorded pupil diameter is able to represent the workload.

Fig. 5. Recorded EDA during video and emotional session. Green lines represent positive mood induced participants and blue lines negative mood induced participants. (Color figure online)

Fig. 6. Comparison of the reached scores between positive and negative mood induced participants.

EDA. In contrast to the pupil diameter, the EDA values increased during the video session in most cases (see Fig. 5(a)). This was expected and suggests that the mood induction with the videos was successful. The comparison of the EDA curves between the positive (green lines) and negative (blue lines) sessions shows that participants who watched the positive video were more aroused than participants who watched the negative video. This might be because the emotion induced by the negative video is sadness, which might not be an emotion characterized by high arousal [37]. During the emotional session, while participants performed the air controller task, this arousal effect lasted the whole time for three participants (see Fig. 5(b)). Two of the participants who watched the negative video were more aroused by the task, and one participant with induced positive mood showed a decrease in the curve.

Score. Comparing the scores under negative and positive mood induction (see Fig. 6), we see that participants with induced negative mood reached a higher score than participants with induced positive mood. Thus, the experimental setting appears able to capture the influence of positive and negative mood on performance in an air controller task.

5 Conclusion

We described how we investigate the influence of emotion and workload on the performance of air controller tasks. We discussed the possible influences and how we are able to investigate these influences in an experiment with students. We presented the structure and methods of our pretest, which prepares a full experiment with a larger sample size. We presented preliminary results and discussed all methods used with respect to their expected outcomes and their usability for the participants. As a result, we decided to use all described techniques except the NASA-TLX, which we will replace with the ISA questionnaire. Based on these results, we are able to conduct our experiment investigating the influences of workload and emotion on air controller tasks.