1 Introduction

User interfaces in the car are required for an increasing number of features and functions [12]. Design approaches consider these features individually and not always as a function of the demands of the target environment. Activities in the vehicle environment can be divided into two groups: those directly related to the act of driving, Driving Related Activities (DRA), and those unrelated to the act of driving, Non-Driving Related Activities (NRA) [23].

The opportunity to engage in NRA in the vehicle is increasing, and such activities directly conflict with the goal of driving. Many existing methods for vehicle interface design are being ignored in favour of more agile methods [22]. Whilst the DRA has remained consistent, NRAs are constantly changing, with inconsistencies existing between manufacturers. If not managed correctly, this makes it difficult for users moving between vehicles to understand the UI logic [17]. The challenge is to develop interfaces for both DRA and NRA that can work in harmony and give optimal performance for the user during multi-task conditions.

2 Situation Awareness

2.1 A Definition

The most commonly used definition of Situation Awareness (SA) is “the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future” [7]. Figure 1 shows Endsley’s basic SA model. SA precedes decision making and action and is found within short term memory [5]. SA focuses on situational elements of an environment, how they develop over time and how knowledge of a situation forms [3].

Fig. 1. A high level model of Situation Awareness, Endsley et al., 1995b

Endsley reports a three level process [6]. Level 1 (Perception) is goal directed [7]. Goals influence the selection of appropriate long term memory structures, which direct attention and perception [4]. Level 2 (Comprehension) is a synthesis of perceived elements and formulates a Situation Model (SM) [16]. Level 2 requires domain knowledge; otherwise, inaccuracies in the SM may lead to operator error [4]. Level 3 (Projection) is where the SM is used to make predictions about how a situation may develop. In highly dynamic scenarios this is required for successful completion of an activity and is considered the mark of a skilled expert [5]. Some people may not achieve Level 3 [4]. Level 3 is seen regularly in driving, where drivers anticipate traffic situations, and it requires long term memory structures such as schema, scripts and mental models [9]. SA has been used to develop interfaces in domains such as aviation [1], military [15] and other professional environments [8].

2.2 Applying SA in the Automotive Domain

Driving was highlighted early on as an area of potential for SA [6, 7]. The main goal of driving is to travel from one place to another in a safe, timely manner, with the sub goals being Control, Monitoring, Hazard Avoidance and Navigation [11].

Matthews et al. (2002) proposed a model of SA for driving and used it to analyse how new interfaces affect driver performance [20]. A large number of publications have used SA to report the effects of mobile phone conversations on driving performance [21, 24]. More recent SA research looked at interaction with other NRAs whilst driving to see whether drivers interact in a “situationally aware manner” [25].

Baumann et al. (2007) decomposed SA for automotive using Construction Integration Theory [16], proposing a cognitive framework for driving SA [2]. A relationship similar to Endsley’s is proposed using construction (Perception) and integration (long term memory, LTM, activations). These comparisons with LTM describe why experienced drivers perform better at hazard perception. For example, if the SM matches knowledge in LTM, the appropriate routine or schema can be selected to complete the action faster and with more success. Whilst this model offers a cognitive approach to the DRA, it does not explicitly account for conditions in which multi-tasking is present.

To date, driver behaviour research has considered SA almost exclusively with respect to the DRA, where SA is only built relative to the driving environment [10, 18, 19]. There is a general lack of consideration of how the NRA is an integral part of the SM.

2.3 A New Approach to SA in Automotive

In aviation, the goal is to safely control the aircraft. This “Single Goal SA” is an example of a professional operator domain where multitasking happens but is generally related to the main goal. This is how the driving domain has been considered to date.

The vehicle exists in an environment where an operator can build SA for the DRA. This is where the similarities to aviation end, because the car also has a number of potential activities that compete with driving. This key characteristic is the difference between a professional environment, such as that of the pilot, and a private scenario. So far, SA research has focused on domains associated with those attempting to accomplish a professional role where performance regulation and continuous training are compulsory. This exposes difficulties in applying SA where multi-tasking across competing goals is an accepted part of the environment. There is a need to adapt SA to include competing, unrelated goals for environments such as the vehicle.

A theoretical model is proposed (Fig. 2). The original SA model is expanded to include two goals, the DRA and NRA. The user must build up SMs specific to each goal by focusing on elements within each specific environment. Each SM will decay when out of focus and not part of situation assessment.

Fig. 2. The Dual Goal SA model compared to the original Single Goal SA model

Goals can compete, so factors within the environment may be helpful for one goal but unhelpful for another (e.g. sunlight is helpful for seeing outside of the car but makes in-car displays difficult to read). By looking at SA through this lens, it is hoped that identifying specific properties will lead to positive implications for UI design and to new ideas relating to attention and the cognitive properties of SA.

3 Methodology

3.1 Experimental Design

A pilot study was proposed to identify a method of assessing two competing goals. The aim was to measure the effect SA has on task performance by varying SA across two competing activities. SA is difficult to measure but good task performance is indicative of good SA, thus good performance on both should be indicative of high SA in both.

The experiment used a mixed design, with DRA experience as a between subjects factor and NRA interface as a repeated measure. Twenty male participants, all between the ages of 20 and 40, were recruited. All had valid driver’s licenses and all were given a short cognitive test prior to the experiment, which showed no outliers. The independent variables were DRA SA level (2 levels, High and Low) and NRA SA level (2 levels, High and Low). The 2 × 2 design gave 4 experimental conditions overall.

To generate the two DRA levels, Novice and Experienced drivers were used (10 per group). There is evidence that experience is a good measure of the ability to build awareness of a situation [27]. The Experienced group were given 30–45 min of training prior to the trial. The Novice group were informed of the DRA pre-trial but were immediately exposed to the experimental condition without training.

The driving simulation used was Mario Kart Wii for the Nintendo Wii, with controls built into a traditional driving setup with steering wheel and pedals. This simulation was chosen because it allowed a true novice group to be created and because it has a highly dynamic driving environment. The simulation was modified to remove any competitive elements (timing, position and threats), and participants were asked to complete three laps whilst keeping the vehicle between the centre line and the right hand side edge of the track. Participants were advised to avoid collisions with other road users and to carry out both activities. Each run lasted approximately 3 min.

To generate the two NRA levels, two interfaces were used which simulated high and low SA. All participants were exposed to both interfaces, and the NRA task was the same for both: a simple Visual-Manual number entry task (Fig. 3).

Fig. 3. NRA Interfaces, Fixed Keypad (left) and Variable Keypad (right)

The High SA condition used a telephone number pad with physical moving buttons to give people confidence in key location and activation (Fixed Keypad). The Low SA condition used a touchscreen keypad to remove the physical properties. The keys were spread out across a region approximately 15 cm × 15 cm in size and moved randomly after each entry (Variable Keypad). This meant the user could not become familiar with the interface layout, a factor associated with an inability to build awareness.

The interfaces were located on the centre console of the rig, to the left of the participant. Each participant was given an instruction sheet explaining the tasks prior to the experiment but was not allowed to practice prior to the trial. Participants did, however, attempt each task in a single task condition before being exposed to the dual task condition.

The main dependent variables were a combination of subjective and objective measures. For the subjective measures, self-reported workload using the NASA TLX [14] and the Situation Awareness Rating Technique (SART) [26] were collected after each run. Each questionnaire was completed for both tasks, so the participant recorded two scores, one for each task. This was done to see whether participants could distinguish the demand or SA requirement of the individual tasks when attempting them simultaneously.
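The per-task scoring described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say which questionnaire variants were used, so the sketch assumes the unweighted (raw) NASA TLX and the three-dimensional SART, in which the score is Understanding minus the difference between Demand and Supply. The function names and all rating values are hypothetical and are not study data.

```python
from statistics import mean

# The six NASA TLX subscales, each assumed to be rated on a 0-100 scale.
TLX_SUBSCALES = ("mental", "physical", "temporal",
                 "performance", "effort", "frustration")


def raw_tlx(ratings):
    """Raw (unweighted) NASA TLX: the mean of the six subscale ratings."""
    return mean(ratings[name] for name in TLX_SUBSCALES)


def sart_3d(demand, supply, understanding):
    """3-D SART score: Understanding - (Demand - Supply)."""
    return understanding - (demand - supply)


# Hypothetical ratings from one participant after one dual task run,
# recorded separately for the driving task (DRA) and the keypad task (NRA).
run_ratings = {
    "DRA": {"tlx": dict.fromkeys(TLX_SUBSCALES, 60),
            "sart": {"demand": 5, "supply": 4, "understanding": 6}},
    "NRA": {"tlx": dict.fromkeys(TLX_SUBSCALES, 40),
            "sart": {"demand": 3, "supply": 5, "understanding": 4}},
}

# One workload score and one SA score per task, as in the procedure above.
run_scores = {
    task: {"workload": raw_tlx(r["tlx"]), "sa": sart_3d(**r["sart"])}
    for task, r in run_ratings.items()
}
```

Keeping the two tasks as separate entries makes it easy to check whether a participant simply repeated the same score for both tasks.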

The objective measures covered both the DRA and the NRA and included steering angle, driving time, lane exceedances, pedal activations, driving incidents, NRA completion rate, mean task time, errors and glances away from the road, to name but a few. Although the focus was on dual task performance, single task performance was also measured to confirm performance in the higher SA conditions and so validate the approach.

3.2 Setup and Procedure

The experiment was run in a low fidelity driving simulator located at the Jaguar Land Rover HMI Research Lab. Participants were asked to complete a pre-test questionnaire and read instructions including a consent form. The Experienced group were then given their pre-exposure to the DRA (30–45 min), whereas the Novice group were taken straight into the experiment.

The run order varied per participant. The first 3 runs and the final run (run 6) were always the same, with runs 4 and 5 counterbalanced to avoid learning effects on the DRA. The run order used was Driving Only, Fixed Keypad Only, Variable Keypad Only, Dual Task 1 (counterbalanced), Dual Task 2 (counterbalanced) and Driving Only. After each run the participant was asked to complete a post-test questionnaire for Workload and SART before moving on.
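The run sequence above can be sketched in a few lines. This is an illustration only: the text does not state how the counterbalancing was assigned, so the sketch assumes that the two dual task runs pair driving with the Fixed and Variable Keypads respectively and that their order alternates with participant index. The function `run_order` and the run labels are hypothetical.

```python
# Runs 1-3 and run 6 are fixed; runs 4 and 5 are counterbalanced.
SINGLE_TASK_RUNS = ["Driving Only", "Fixed Keypad Only", "Variable Keypad Only"]
DUAL_TASK_RUNS = ["Dual Task (Fixed Keypad)", "Dual Task (Variable Keypad)"]
FINAL_RUN = "Driving Only"


def run_order(participant_index):
    """Return the six-run sequence for one participant.

    Assumption: odd-numbered participants receive the dual task runs in
    reverse order, so each order is used equally often in an even-sized group.
    """
    dual = list(DUAL_TASK_RUNS)
    if participant_index % 2 == 1:
        dual.reverse()
    return SINGLE_TASK_RUNS + dual + [FINAL_RUN]


# Sequences for a group of 10 participants.
group_orders = [run_order(i) for i in range(10)]
```

Alternating by index is only one way to counterbalance; a randomised assignment with equal group sizes would serve the same purpose.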

3.3 Hypothesis

This experiment focused on testing traditional performance measures and their ability to successfully classify SA corresponding to the experimental groups (Table 1). The primary hypothesis was that significant differences would be found between condition 4 and condition 1. A secondary hypothesis was that performance for conditions 2 and 3 would fall in between that for conditions 1 and 4.

Table 1. Experimental Conditions

4 Results

All data passed normality tests and met the assumptions of ANOVA, and any significant outliers were removed. Three factors were used in the test: Type (Fixed, Variable Keypad), Mode (Single, Dual Task) and Experience of the DRA (Novice, Experienced).
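As a reminder of how an F-value and its two degrees of freedom are formed, the following minimal sketch computes a one-way ANOVA by hand. It is a simplification under stated assumptions: the analysis reported in this section used three factors, whereas the sketch handles a single factor, and the function name `one_way_anova` and the example numbers are illustrative and are not study data.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (m - grand_mean) ** 2
                     for group, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for group, m in zip(groups, group_means) for x in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up scores for three groups of three observations each.
f_value, df_between, df_within = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

The two degrees of freedom returned here correspond to the pair of numbers quoted in brackets alongside each F-value.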

4.1 Subjective Results

An ANOVA of DRA Workload revealed a significant effect of Type [F(3, 72) = 7.93, p < 0.01]. No significant difference was found between the Dual-Task conditions. NRA Workload showed no significant results for any of the factors. There was, however, a trend for higher reported figures in the Dual-Task conditions.

An ANOVA of DRA SART demonstrated no significant effects for any condition. An ANOVA of NRA SART revealed a significant Type*Mode*Experience interaction [F(2, 108) = 3.42, p = 0.036]. A Tukey post hoc pairwise comparison showed a difference between the Fixed Keypad*Single*Novice and Variable Keypad*Dual*Experienced conditions (p < 0.05). All other differences were non-significant, and neither scale agreed with the primary or secondary hypothesis.

4.2 Objective Results

For the DRA based objective data, Driving Time, Steering Wheel Angle, Total Off Track Time and Pedal Activations all supported the hypothesis at significant levels (p < 0.05). There was also general agreement with the secondary hypothesis.

ANOVA reported significant effects of Experience for all parameters (p < 0.05). For Driving, the Variable Keypad was significantly worse than the Fixed Keypad [F(3, 75) = 7.76, p < 0.001]. The same was true of Type for Steering Wheel Angle [F(3, 72) = 6.30, p = 0.001] and Pedal Up Time [F(3, 72) = 6.30, p = 0.001].

ANOVA for NRA Completion Rate showed significant effects of Type [F(2, 108) = 766.82, p < 0.001], Mode [F(1, 108) = 593.24, p < 0.001], Experience [F(1, 108) = 6.89, p = 0.010] and Type*Mode [F(2, 108) = 180.24, p < 0.001]. A post hoc Tukey comparison showed the difference between the Variable and Fixed Keypad conditions to be significant (p < 0.05). Mean Task Completion Time showed significant effects of Type [F(2, 106) = 68.97, p < 0.001], Mode [F(1, 106) = 142.14, p < 0.001], Type*Mode [F(2, 106) = 44.82, p < 0.001], Mode*Experience [F(1, 106) = 4.92, p = 0.029], Type*Experience [F(2, 106) = 9.70, p < 0.001] and Type*Mode*Experience [F(2, 106) = 4.46, p = 0.014].

4.3 Factor Analysis of Objective Measure

An un-rotated principal component factor analysis was carried out in Minitab to reduce all of the measures to a smaller number of factors (Table 2).

Table 2. Eigenvalues for Factor Analysis of Driving and Non-Driving based Measures

The variables used were Driving Time, Steering Angle, Pedal Activations and Off Track Time for driving, and Completion Rate, Mean Completion Time, Errors and Mean Glance Time for the NRA. The eigenvalues of the result can be seen in Table 2.
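For readers unfamiliar with the technique, an un-rotated principal component extraction can be sketched as an eigen-decomposition of the correlation matrix of the measures, with each factor's loadings given by its eigenvector scaled by the square root of its eigenvalue. The sketch below is an illustration only and is not the Minitab procedure used in the study: the function names are hypothetical, the eigenvalues are found by simple power iteration with deflation, and the example correlation matrix is synthetic rather than study data.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (a list of lists)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def principal_factors(corr, k, iterations=1000, seed=0):
    """First k un-rotated principal component factors of `corr`.

    Returns a list of (eigenvalue, loadings) pairs, largest eigenvalue first.
    k is assumed to be small relative to the number of measures.
    """
    p = len(corr)
    matrix = [row[:] for row in corr]
    rng = random.Random(seed)
    factors = []
    for _ in range(k):
        # Power iteration from a random start converges to the dominant
        # eigenvector of the (deflated) matrix.
        vec = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for _ in range(iterations):
            nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in nxt))
            vec = [x / norm for x in nxt]
        eigenvalue = sum(vec[i] * matrix[i][j] * vec[j]
                         for i in range(p) for j in range(p))
        loadings = [v * math.sqrt(max(eigenvalue, 0.0)) for v in vec]
        factors.append((eigenvalue, loadings))
        # Deflate: remove the extracted component before finding the next one.
        for i in range(p):
            for j in range(p):
                matrix[i][j] -= eigenvalue * vec[i] * vec[j]
    return factors


# Synthetic example: measures 1 and 2 are correlated, measure 3 is independent.
example_corr = [[1.0, 0.8, 0.0],
                [0.8, 1.0, 0.0],
                [0.0, 0.0, 1.0]]
example_factors = principal_factors(example_corr, k=2)
```

In this synthetic case the first factor loads on the two correlated measures and the second on the independent one, which mirrors how measures that move together are grouped into a single factor. The sign of a set of loadings is arbitrary.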

Factor 1 is dominated by Completion Rate, Task Time, Glances and Errors and can therefore be known as NRA Performance. Factor 2 is dominated by Driving Time, Steering Angle, Off Track and Pedal Up Time and can therefore be known as DRA Performance. A third factor demonstrates correlation between Pedal Up Time and NRA Errors. This Factor appears to explain that errors are likely to happen when the user comes under some pressure in the DRA and can be known as Incident Related Errors. The data points for factors 1 and 2 can be seen in Fig. 4. The two main factors identified are able to classify the participants broadly into groups consistent with the original hypothesis.

Fig. 4. Score Plot of Driving and Non Driving based Measures

5 Discussion

The factor analysis (FA) was successful in characterising the original experimental groups, in agreement with the hypothesis. An FA is useful because the effects of performance in the NRA are seen in the DRA data and vice versa. The separation between the two NRA conditions is clear, but the DRA shows more of an overlap between the Novice and Experienced groups. The Novices learned quickly and in some cases became as good as some of the Experienced group, who were also affected by the environmental variability.

All DRA objective measures agreed with both the primary and secondary hypotheses. The NRA objective measures also agreed with the primary hypothesis but not entirely with the secondary hypothesis. The NRA values lay outside of those found in Cells 1 and 4 of Table 1 rather than in between. This can be explained by experience having an effect on the management of risk in a dual task scenario. Experienced users are more aware of what could happen and thus modulate their performance with the NRA.

The DRA measures covered a wide range of situational artefacts. It appears to be important to make use of complementary and partially redundant measures. Driving Time complemented Pedal Activations; both appear to be useful in distinguishing between Low and High SA, although this seems to diminish with experience. Steering Angle and Off Track Events are again complementary but seem to measure the same thing. Driving Time and Steering Angle have fine resolution but may fail to pick up short term events, which can be covered by the Pedal Activations and Off Track Events. Adding redundancy will ensure artefacts and significant SA events are accounted for.

For the NRA, Mean Completion Time displays similar findings to Mean Completion Rate. Again, the root cause of a poor response may be difficult to determine because of the demands of the driving environment. That said, combining these with Errors and Glances away from the road will give an equal level of redundancy. Mean Completion Time in combination with Task Errors may be a more appropriate match. Glances offer insight into how the user is managing attention but are inconsistent, especially where an Auditory-Vocal interface with no visual information is used. During multi-tasking, performance may be counterintuitive; for example, if the driver is risk averse, performance may be more regulated. It is therefore important to use complementary measures across the NRA and DRA. For example, Visual-Manual tasks are more likely to cause effects in the steering wheel trace than the pedal. If using an Auditory-Vocal task, an equally complementary measure needs to be found.

Driving Incidents was not considered in the factor analysis due to a lack of observable differences between the Low and High SA groups. Despite having awareness the High SA group still had numerous driving incidents. This demonstrates the load on the user by an NRA and the effect this has on the ability to balance both tasks. Incidents are not always caused by poor SA; this was especially true in the dynamic driving task used.

Neither subjective measure was able to classify the groups specified in the hypothesis. This could be due to a methodological issue or to individual perception of workload, especially in the Novice group. Workload could distinguish Single and Dual Task conditions but not the different types of Dual Task. This finding conflicts with previous work which found that reported workload increases with task difficulty [13]. The approach of asking the questions separately for each task shows promise, as individuals did not simply give the same score for each task.

6 Conclusion

The model proposed offers a new lens on how SA could be applied in the development of automotive user interfaces. The pilot study demonstrated that it is possible to use task performance to classify levels of SA, especially when considering a Dual Goal scenario like automotive. A factor analysis proved able to classify groups according to the hypothesis, and there is clear evidence that Experience can provide a useful proxy for high awareness. Knowledge built up over time could be considered a precursor to being able to carry out tasks successfully in a demanding environment.

Subjective measures, which should have backed up the objective measures, require further development. There appears to be utility in separating out questions when multiple tasks are being assessed, but the questionnaire used and how it is applied need modification. Future work would need to make use of a more realistic driving simulation and balance the gender of participants. It is also unlikely that true novices in the DRA can be found, and thus the focus here should switch to the NRA. The interface design aspect of this research will consider whether it is possible to design interfaces that enable SA for both DRA and NRA tasks, and whether this gives objective benefits in terms of performance in a modern automobile.