Study of NASA-TLX and Eye Blink Rates Both in Flight Simulator and Flight Test

Zheng, Yiyuan; Jie, Yuwen

doi:10.1007/978-3-030-22507-0_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11571)

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

To determine the minimum flight crew number and to show compliance with the airworthiness regulation CS25.1523, the workload of the flight crew should be measured in various flight scenarios, both in a flight simulator and in flight test. However, the complexity, environment and safety considerations of flight test require the flight crew to take more responsibility and to be more careful with decisions and actions under higher stress, and it may be inappropriate to carry out a flight test in a high-risk abnormal situation. Therefore, it is necessary to assess workload measures in a simulator to predict in-flight behavior.

In this research, NASA-TLX and eye blink rate were compared between flight simulator and flight test in three flight scenarios: Standard Instrument Departure, Manual Departure, and Standard Instrument Approach. The study was carried out in a CRJ-200 full-flight simulator and a CRJ-200 aircraft, and a total of nine pilots participated.

According to the results, both flight scenario and environment had a significant influence on NASA-TLX. However, eye blink rate showed significant differences only between flight environments. Furthermore, for NASA-TLX and eye blink rate, the relationship between simulator and flight test was weak. Therefore, to reduce the number and risk of compliance-demonstration flight tests, it is necessary to identify more sensitive psychophysiological measurements.

1 Introduction

For commercial aircraft, airworthiness is the certification and supervision of the design, manufacture, implementation and maintenance of the aircraft according to the airworthiness regulations and materials, on behalf of the public [1]. The aim of airworthiness is to ensure that the aircraft achieves the safety level the regulations require. Typically, the design of commercial aircraft should comply with the Certification Specifications for Large Aeroplanes (CS-25), issued by the European Aviation Safety Agency [2].

Human factors are the most important factors that can threaten aviation safety. According to statistics, over 70% of flight accidents have been attributed to human factors [3]. Several airworthiness regulations in CS-25 concern human factors. Among them, CS25.1523 (Minimum Flight Crew) is one of the most important; it stipulates that the number of flight crew should be determined based on the workload on individual crew members. In other words, to show compliance with CS25.1523, the workload of each flight crew member should be measured. Furthermore, the recommended means of compliance include simulator test and flight test.

Typically, traditional workload measurements for flight crew are of four types: timeline analysis, task performance measures, subjective rating scale measures and psychophysiological measures [4]. Timeline analysis can be used as an analytic tool to make a priori predictions regarding the task demands imposed on the crew [5]. Based on micro-motion techniques borrowed from industrial engineering, it computes workload as the ratio of the time required to complete the necessary tasks to the time available. Boeing Commercial Airplane used the timeline analysis technique in simulator studies for the design of several aircraft types [6]. Task performance measures can be classified into two major types: primary task measures and secondary task measures [7]. Normally, performance of the primary task will always be of interest, as its generalization is central to the study. Speed, accuracy, response times, and error rates are often used to assess primary task performance [8]. Bliss and Dunn supported the hypotheses that increasing primary task and alarm task workload degraded alarm response performance [9]. In the secondary task technique, operators are given an additional information processing task to perform in conjunction with the task of interest. The rationale underlying the use of secondary tasks is that, by applying an extra load that produces a total information processing demand exceeding the operator’s capacity, workload can be measured by observing the difference between single-task and dual-task performance [10]. Wester et al. examined the impact of a secondary task, an auditory oddball task, on a primary lane-keeping driving task [11]. By studying the impact of simultaneous information conflicts from multiple secondary in-vehicle tasks on the primary task of driving, Lansdown and Brook-Carter suggested that overloading the visual channel would result in performance decrements [12].

Subjective rating scale measures assume that an increased expenditure of resources is linked to perceived effort and can be appropriately assessed by individuals. NASA-TLX, the Bedford scale, and the Modified Cooper-Harper scale are the most popular. Schnell et al. evaluated synthetic vision information systems on the flight deck using NASA-TLX [13]. Pilot workload resulting from a range of wind-over-deck conditions, assessed with the Bedford scale, has been used to develop the Ship-Helicopter Operating Limits for a Lynx-like helicopter and the SFS2 [14]. Physiological measures use the physical reactions of the body to objectively measure the amount of mental work a person is experiencing. An objective measurement would seem to be the most exact, and therefore the best, way to assess workload because, unlike subjective measures, it does not require a direct response from the person [15]. Among physiological measures, eye activity and cardiac activity are the main research focuses. Heart rate is considered the most common and reliable measure of workload; generally, heart rate increases as workload increases [16]. Moreover, eye activity, including pupil diameter and eye blink rate, can also indicate workload. Normally, pupil diameter is found to increase with increasing mental workload, and eye blink rate to decrease with increasing workload [17].
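The timeline-analysis ratio described above (time required divided by time available) can be illustrated with a minimal Python sketch. The function name, the task list and all durations are hypothetical and chosen only for illustration; they are not taken from the study or from the cited reports.

```python
def timeline_workload(time_required_s: float, time_available_s: float) -> float:
    """Return workload as time required divided by time available."""
    if time_available_s <= 0:
        raise ValueError("time_available_s must be positive")
    if time_required_s < 0:
        raise ValueError("time_required_s must not be negative")
    return time_required_s / time_available_s


# Hypothetical tasks in one flight phase: (task name, seconds required).
tasks = [
    ("set thrust", 4.0),
    ("rotate", 3.0),
    ("retract gear", 2.0),
    ("callouts", 6.0),
]
required = sum(duration for _, duration in tasks)  # 15.0 s in total
ratio = timeline_workload(required, time_available_s=30.0)
print(f"workload ratio = {ratio:.2f}")  # workload ratio = 0.50
```

A ratio approaching or exceeding 1 would mean the tasks no longer fit into the time available; any threshold applied to this ratio would be a further assumption.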

Since flight test may include high- or medium-risk scenarios, it is necessary to select appropriate workload measurements that do not interfere with flight crew operation. Therefore, to determine the desirable workload measurements in simulator and flight test, a subjective rating scale measure (NASA-TLX) and a physiological measure (eye blink rate) were analyzed in this study. Nine pilots, forming six flight crews, participated in this test, which contained three flight scenarios: Standard Instrument Departure (SID), Manual Departure (MD), and Standard Instrument Approach (SIA).

2 Method

2.1 Subjects

Nine Chinese male pilots ranging in age from 30 to 50 (Mean = 41.3 ± 5.23) were invited to participate in this experiment. These pilots were either commercial airline pilots or flight instructors from China Eastern Airlines. All of them had been recruited as captains or co-captains on some aircraft type (5 for B737, 4 for B747). The pilots were paired into six flight crews. Three of them were assigned different flight responsibilities in the different crews they belonged to, i.e., as Pilot Flying in one crew and as Pilot Monitoring in the other. Before the experiment, all subjects signed a consent form approved by the Institutional Review Board of Shanghai Jiao Tong University.

2.2 Apparatus

The experiment was carried out in a CRJ-200 full-flight simulator, a qualified Level D flight simulator. All configurations in the flight simulator are identical to those of the real aircraft. The flight test was conducted in a real CRJ-200 aircraft, as shown in Fig. 1.

Besides the flight simulator and the aircraft, a head-mounted eye tracker (Tobii Glasses, Sweden) with a sample rate of 30 Hz was used to determine the eye blink rate of the subjects during the experiment, as shown in Fig. 2.
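The paper does not describe how blinks were extracted from the eye tracker output. As a rough sketch only, the following Python code assumes that each 30 Hz sample has already been labeled as eye closed or eye open, and counts a blink for every run of closed samples lasting between 100 ms and 500 ms. The labeling, the duration thresholds and the function names are assumptions made for illustration, not the authors' method or the tracker's API.

```python
from typing import Sequence

SAMPLE_RATE_HZ = 30.0  # sampling rate stated for the eye tracker


def count_blinks(eye_closed: Sequence[bool],
                 min_ms: float = 100.0,   # assumed lower bound for a blink
                 max_ms: float = 500.0,   # assumed upper bound for a blink
                 sample_rate_hz: float = SAMPLE_RATE_HZ) -> int:
    """Count runs of consecutive closed samples lasting min_ms..max_ms."""
    sample_ms = 1000.0 / sample_rate_hz
    blinks = 0
    run = 0
    # A trailing False acts as a sentinel so a run at the end is evaluated.
    for closed in list(eye_closed) + [False]:
        if closed:
            run += 1
        elif run:
            if min_ms <= run * sample_ms <= max_ms:
                blinks += 1
            run = 0
    return blinks


def blink_rate_per_min(eye_closed: Sequence[bool],
                       sample_rate_hz: float = SAMPLE_RATE_HZ) -> float:
    """Return blinks per minute over the whole recording."""
    if not eye_closed:
        raise ValueError("recording is empty")
    blinks = count_blinks(eye_closed, sample_rate_hz=sample_rate_hz)
    return blinks * 60.0 * sample_rate_hz / len(eye_closed)


# Hypothetical 10 s recording: two blinks of 5 samples (about 167 ms each)
# and one single-sample dropout (about 33 ms) that is too short to count.
trace = [False] * 300
trace[30:35] = [True] * 5
trace[150:155] = [True] * 5
trace[200] = True
rate = blink_rate_per_min(trace)
print(f"{rate:.1f} blinks/min")  # 12.0 blinks/min
```

At 30 Hz one sample spans about 33 ms, so short closures are resolved only coarsely; the thresholds above would need tuning against the actual tracker data.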

2.3 Procedure

To compare the workload measurements in flight simulator and flight test, three flight scenarios were designed: Standard Instrument Departure (SID), Manual Departure (MD), and Standard Instrument Approach (SIA). Each flight scenario was carried out by each flight crew in both the flight simulator and the flight test. The configurations and operating procedures for the flight scenarios were the same in flight simulator and flight test, as follows.

1. Standard Instrument Departure

The flight scenario was conducted at Chengdu Shuangliu International Airport. The task started when the pilots pressed the “TOGA (Takeoff/Go-around)” button. Then, the subjects pushed the throttle and kept accelerating. When the aircraft reached the speed VR, the subjects rotated and maintained a climb of approximately 3 degrees. When the aircraft reached an altitude of 1500 feet, the subjects were required to engage the autopilot system and keep monitoring the essential flight parameters until climbing to 10000 feet.

2. Manual Departure

The flight scenario was conducted at Chengdu Shuangliu International Airport. The task started when the pilots pressed the “TOGA” button. Then, the subjects pushed the throttle and kept accelerating. When the aircraft reached the speed VR, the subjects rotated and maintained a climb of approximately 3 degrees. Moreover, on observing a positive rate of climb on the Primary Flight Display, the subjects were required to retract the landing gear and keep climbing to 10000 feet by hand.

3. Standard Instrument Approach

The flight scenario was conducted at Chengdu Shuangliu International Airport. The task started 40 nautical miles from the descent point. After slowing down to 145 knots and descending to 1500 feet, the aircraft was in the landing pattern. The subjects executed a CAT I standard instrument approach procedure and landed on the runway.

The simulator experiment was conducted prior to the flight test. In the first session, the subjects performed a standard instrument departure and a standard instrument approach. In the second session, they performed a manual departure and a standard instrument approach. After each task, every subject was asked to fill in the NASA-TLX scale. In the flight test, the procedures were the same as in the flight simulator.

2.4 Statistical Analysis

SPSS 17.0 for Windows was used to process the experimental data; analysis of variance (ANOVA) and correlation analysis were performed. Results were considered statistically significant when p < 0.05.
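The analysis itself was run in SPSS, and the text does not state which ANOVA design was used. For readers who want to see how an F statistic with its two degrees of freedom arises, here is a minimal Python sketch of a plain one-way between-groups ANOVA. The design choice, the function name and the numbers are assumptions for illustration; the data are invented and are not the study's measurements.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    group_means: List[float] = [mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented scores for three scenarios, five observations each.
scenario_a = [1.0, 2.0, 3.0, 4.0, 5.0]
scenario_b = [2.0, 3.0, 4.0, 5.0, 6.0]
scenario_c = [3.0, 4.0, 5.0, 6.0, 7.0]
f_stat, df_b, df_w = one_way_anova([scenario_a, scenario_b, scenario_c])
print(f"F({df_b},{df_w}) = {f_stat:.2f}")  # F(2,12) = 2.00
```

Converting F to a p value requires the F distribution, which statistics packages such as SPSS provide; that step is omitted here.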

3 Results

3.1 NASA-TLX Scales

Considering the NASA-TLX results, the three flight scenarios differed significantly in the simulator experiment (F(2,12) = 3.01, p = 0.040). Standard Instrument Approach (SIA) had the highest average NASA-TLX score (Mean = 27.92, SD = 9.54), Standard Instrument Departure (SID) the lowest (Mean = 19.85, SD = 5.08), and Manual Departure (MD) was in between (Mean = 22.58, SD = 7.32). Similarly, in the flight test, standard instrument approach had the highest NASA-TLX score (Mean = 33.42, SD = 10.24), manual departure was in between (Mean = 28.75, SD = 7.06), and standard instrument departure had the lowest (Mean = 25.42, SD = 9.00). However, the differences among the three flight scenarios in the flight test were insignificant (F(2,12) = 3.01, p = 0.063). Furthermore, the difference between simulator experiment and flight test was significant in standard instrument departure (t = 2.43, p = 0.024) and in manual departure (t = 2.10, p = 0.047), but insignificant in standard instrument approach (t = 1.36, p = 0.187). In addition, NASA-TLX scores showed a moderate correlation between simulator and flight test (R = 0.524, p = 0.001), as depicted in Fig. 3.

3.2 Eye Blink Rate

Considering the eye blink rate results, as shown in Fig. 4, the differences among the three flight scenarios were significant only in the simulator experiment (F(2,12) = 4.711, p = 0.016); in the flight test they were insignificant (F(2,12) = 0.003, p = 0.997). In the simulator experiment, standard instrument departure had the highest average eye blink rate (Mean = 14.08, SD = 3.63), standard instrument approach the lowest (Mean = 9.83, SD = 2.72), and manual departure was in between (Mean = 11.58, SD = 3.78). In the flight test, however, the differences were slight. Furthermore, comparing simulator experiment and flight test for each flight scenario, only the difference in standard instrument departure was significant (t = 3.331, p < 0.01); the differences in manual departure (t = 1.457, p = 0.159) and standard instrument approach (t = 0.213, p = 0.834) were insignificant. In addition, eye blink rate showed a weaker correlation between simulator and flight test (R = 0.242, p = 0.155).
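The text reports a correlation coefficient between simulator and flight test scores without naming the coefficient. Assuming it is Pearson's r, a minimal Python sketch of the computation looks as follows. The function name and the paired scores are hypothetical and are not the study's data.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Return Pearson's correlation coefficient for paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for constant input")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical paired scores: the same pilots in simulator and in flight.
simulator_scores = [20.0, 25.0, 30.0, 35.0]
flight_scores = [24.0, 31.0, 33.0, 40.0]
r = pearson_r(simulator_scores, flight_scores)
print(f"r = {r:.3f}")
```

A value near 1 would indicate that pilots who rate workload higher in the simulator also rate it higher in flight; the significance test for r is left to a statistics package.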

4 Discussion

Flight test is the most direct means of compliance in aircraft human factors airworthiness certification. However, it is not the preferred means, for three reasons. Firstly, it might not be appropriate to test an abnormal situation, for safety reasons [18]. Secondly, in a flight environment it is normally difficult to manipulate the operational environment, which might be required to apply the scenario-based approach. Last but not least, human factors scenarios performed in flight test are not easy to duplicate because of the lack of controllability of the operational context [19]. Therefore, simulator test might be more appropriate than flight test, especially in high-risk flight scenarios, and both should be examined from the standpoint of human workload to show compliance with airworthiness requirements.

However, the classic workload measurements have their own limitations. The repeatability and validity of subjective rating scale measures are sometimes uncertain, and data manipulations are often questioned as being inappropriate [20]. Moreover, the subjective feeling of workload was found to depend essentially on the time stress involved in performing the task, and for time-stressed tasks only [21]. For task performance methods, because of the compensatory effect of increased effort, performance is clearly not sufficient to assess the state of the operator [22], and some other factors, such as strategy, affect performance and workload differently [23]. Psychophysiological measures are influenced by the ambient environment and task duration [24]. In real flight, most pilots prefer to wear sunglasses to block direct sunlight. Moreover, some studies assumed that eye movement activity parameters can provide a sensitive measure of visual workload only [25]. Therefore, it is necessary to select the desirable workload measurements according to the specific characteristics of simulator test and flight test.

NASA-TLX is a multidimensional rating scale that assesses a subject’s subjective workload on six 100-point scales, each related to a different aspect of workload: Mental Demand, Physical Demand, Temporal Demand, Performance, Effort, and Frustration [26]. It is more precise and comprehensive in workload evaluation. Besides, eye measures were sensitive to intermediate levels of mental effort as well [27], and also produced reliable near-real-time indicators of workload in flight simulators [28].
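The paper does not say whether the raw or the weighted NASA-TLX score was used. The Python sketch below shows both common variants under stated assumptions: the raw score is the unweighted mean of the six ratings, and the weighted score uses weights obtained from 15 pairwise comparisons of the dimensions, so the weights sum to 15. The function names, ratings and weights are hypothetical illustration values.

```python
from typing import Dict

DIMENSIONS = (
    "Mental Demand", "Physical Demand", "Temporal Demand",
    "Performance", "Effort", "Frustration",
)


def _check_ratings(ratings: Dict[str, float]) -> None:
    """Ensure every dimension has a rating on the 0..100 scale."""
    for dim in DIMENSIONS:
        if not 0 <= ratings[dim] <= 100:
            raise ValueError(f"{dim} rating must be within 0..100")


def raw_tlx(ratings: Dict[str, float]) -> float:
    """Unweighted mean of the six ratings (often called Raw TLX)."""
    _check_ratings(ratings)
    return sum(ratings[dim] for dim in DIMENSIONS) / len(DIMENSIONS)


def weighted_tlx(ratings: Dict[str, float], weights: Dict[str, int]) -> float:
    """Weighted mean; weights come from 15 pairwise comparisons."""
    _check_ratings(ratings)
    if sum(weights[dim] for dim in DIMENSIONS) != 15:
        raise ValueError("weights must sum to 15")
    return sum(ratings[dim] * weights[dim] for dim in DIMENSIONS) / 15


# Hypothetical ratings and pairwise-comparison weights for one pilot.
ratings = {"Mental Demand": 40, "Physical Demand": 20, "Temporal Demand": 30,
           "Performance": 25, "Effort": 35, "Frustration": 10}
weights = {"Mental Demand": 5, "Physical Demand": 1, "Temporal Demand": 3,
           "Performance": 2, "Effort": 4, "Frustration": 0}
print(f"raw = {raw_tlx(ratings):.2f}")                     # raw = 26.67
print(f"weighted = {weighted_tlx(ratings, weights):.2f}")  # weighted = 33.33
```

The weighted variant emphasizes the dimensions a subject considers most relevant to the task, at the cost of an extra pairwise-comparison step after each trial.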

In this study, two types of workload measurement, a subjective method (NASA-TLX) and a psychophysiological measure (eye blink rate), were compared in flight simulator and in flight test across three flight scenarios. The results demonstrated that NASA-TLX and eye blink rate were credible in the flight simulator. Nevertheless, in these three flight scenarios, neither of them produced reliable indicators in flight test. Two further aspects will be addressed in future studies. Firstly, more measures will be implemented in both simulator and flight test environments, for instance subjective measurements such as the Bedford scale and the Modified Cooper-Harper scale, and psychophysiological approaches such as ECG and EEG. Secondly, to ensure flight safety, only normal flight scenarios were selected in this study. Therefore, provided safety can be ensured, more scenarios should be included, especially abnormal conditions such as crosswind handling and single engine failure.

References

1. De Florio, F.: Airworthiness: An Introduction to Aircraft Certification and Operations. Butterworth-Heinemann, Oxford (2016)
2. EASA: Certification Specifications for Large Aeroplanes CS-25 (2009)
3. Wiegmann, D.A., Shappell, S.A.: Human error analysis of commercial aviation accidents: application of the Human Factors Analysis and Classification System (HFACS). Aviat. Space Environ. Med. 72(11), 1006–1016 (2001)
4. Farmer, E., Brownson, A.: Review of workload measurement, analysis and interpretation methods, vol. 33. European Organisation for the Safety of Air Navigation (2003)
5. Stone, G., Gulick, R., Gabriel, R.: Use of task timeline analysis to assess crew workload, DTIC Document (1987)
6. O’Donnell, R., Eggemeier, F.T.: Workload assessment methodology. Meas. Tech. 42, 5 (1986)
7. Cain, B.: A review of the mental workload literature, DTIC Document (2007)
8. Ashcraft, M.H., Kirk, E.P.: The relationships among working memory, math anxiety, and performance. J. Exp. Psychol.: Gen. 130(2), 224 (2001)
9. Bliss, J.P., Dunn, M.C.: Behavioural implications of alarm mistrust as a function of task workload. Ergonomics 43(9), 1283–1300 (2000)
10. Wickens, C.D.: Multiple resources and mental workload. Hum. Factors: J. Hum. Factors Ergon. Soc. 50(3), 449–455 (2008)
11. Wester, A., et al.: Event-related potentials and secondary task performance during simulated driving. Accid. Anal. Prev. 40(1), 1–7 (2008)
12. Lansdown, T.C., Brook-Carter, N., Kersloot, T.: Primary task disruption from multiple in-vehicle systems. ITS J.-Intell. Transp. Syst. J. 7(2), 151–168 (2002)
13. Schnell, T., et al.: Improved flight technical performance in flight decks equipped with synthetic vision information system displays. Int. J. Aviat. Psychol. 14(1), 79–102 (2004)
14. Roper, D., et al.: Integrating CFD and piloted simulation to quantify ship-helicopter operating limits. Aeronaut. J. 110(1109), 419–428 (2006)
15. De Waard, D.: The Measurement of Drivers’ Mental Workload. Traffic Research Center, Groningen University, Netherlands (1996)
16. Hoover, A., et al.: Real-time detection of workload changes using heart rate variability. Biomed. Sig. Process. Control 7(4), 333–341 (2012)
17. Tsai, Y.-F., et al.: Task performance and eye activity: predicting behavior relating to cognitive workload. Aviat. Space Environ. Med. 78(5), B176–B185 (2007)
18. Schutte, P.C., Trujillo, A.C.: Flight crew task management in non-normal situations. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting, pp. 244–248. SAGE, Los Angeles (1996)
19. Perkins, C.D.: Stability and Control: Flight Testing, vol. 2. Elsevier, Amsterdam (2014)
20. Annett, J.: Subjective rating scales: science or art? Ergonomics 45(14), 966–987 (2002)
21. Meshkati, N., et al.: Techniques in mental workload assessment (1995)
22. Wilson, A.G.F.: Operator functional state assessment for adaptive automation implementation. In: Proceedings of SPIE - The International Society for Optical Engineering, vol. 5797, pp. 100–104 (2005)
23. Wickens, C.D., Huey, B.M.: Workload Transition: Implications for Individual and Team Performance. National Academies Press, Washington, DC (1993)
24. Allanson, J., Fairclough, S.H.: A research agenda for physiological computing. Interact. Comput. 16(5), 857–878 (2004)
25. Wilson, G., Fisher, F.: The use of cardiac and eye blink measures to determine flight segment in F4 crews. Aviat. Space Environ. Med. 62(10), 959–962 (1991)
26. Hart, S.G.: NASA-task load index (NASA-TLX); 20 years later. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting. Sage Publications (2006)
27. De Rivecourt, M., et al.: Cardiovascular and eye activity measures as indices for momentary changes in mental effort during simulated flight. Ergonomics 51(9), 1295–1319 (2008)
28. Van Orden, K.F., et al.: Eye activity correlates of workload during a visuospatial memory task. Hum. Factors 43(1), 111–121 (2001)

Acknowledgments

This research work was supported by Professor Fu, Dr. Lu and Dr. Wang of the Man-Machine Environment Lab, School of Electrical Engineering and Electronic Information, Shanghai Jiao Tong University.

Author information

Authors and Affiliations

Shanghai Aircraft Airworthiness Certification Center of CAAC, Shanghai, People’s Republic of China
Yiyuan Zheng & Yuwen Jie

Corresponding author

Correspondence to Yiyuan Zheng.

Editor information

Editors and Affiliations

Coventry University, Coventry, UK
Don Harris

Zheng, Y., Jie, Y. (2019). Study of NASA-TLX and Eye Blink Rates Both in Flight Simulator and Flight Test. In: Harris, D. (ed.) Engineering Psychology and Cognitive Ergonomics. HCII 2019. Lecture Notes in Computer Science, vol 11571. Springer, Cham. https://doi.org/10.1007/978-3-030-22507-0_28

DOI: https://doi.org/10.1007/978-3-030-22507-0_28
Published: 18 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22506-3
Online ISBN: 978-3-030-22507-0