
1 Introduction

Communication is linked to different sensory modalities. In Collaborative Virtual Environments (CVE) and Human-Machine Interaction (HMI), communication is normally limited to the audio and visual channels. Current research suggests that adding channels of communication through further sensory modalities could, to some extent, enhance performance, increase the ability of an individual or a group to accomplish a task, or increase tele-presence [1]. Integrating another sensory modality into the communication model could also convey information to operators whose other senses are preoccupied with a demanding task.

A primary task can be thought of as a physical task a human operator performs with minimal cognitive loading, while a secondary task is a cognitive task in which the operator receives and processes data. For example, a car driver performs a primary task by steering the car in the correct direction; receiving direction information from a GPS would be a secondary task.

In this paper, an experiment is designed and conducted to evaluate participants' performance with combinations of two sensory feedback modes in a primary task. The combined sensory modalities for the primary task are audio-visual, haptic-visual or audio-haptic. A secondary task is also designed to evaluate the workload of each feedback mode and the effect of different workload levels on task completion time and task accuracy. The sensory modalities participants are exposed to in the secondary task are audio, haptic or visual.

2 Related Work

Haptic, Audio and Visual Environments (HAVEs) involve the reproduction of sensory cues via computer peripherals. HAVEs range from simple single-sensory environments to sophisticated multi-sensory and multi-dimensional environments. Complex HAVE systems usually consist of three modalities: haptic technology, binaural sound, and 3D visuals. Each component has different effects that must be handled to build an efficient and realistic virtual environment. Additionally, virtual systems usually involve various kinds of navigation and selection tasks.

There has been significant research to quantitatively and qualitatively evaluate single-modality feedback and human performance in HAVEs. In [2], auditory sensory feedback was evaluated in a collaborative visual and haptic environment, quantitatively by measuring task completion time and qualitatively by interviewing subjects; adding auditory feedback enhanced participants' performance in an object manipulation application. In [3], a standard Fitts' task was used to evaluate human performance in a selection task with and without haptic feedback. Researchers in [4] evaluated the addition of visual feedback in a haptic-enabled virtual environment by studying the influence of visually observing object deformation on the user's perception of static and dynamic friction. Recently, haptic feedback has received wide attention from researchers. For example, performance was studied in a motion tracking application where vibro-tactile feedback is provided to users if they deviate from the desired trajectory [5]. Although the added haptic feedback in [5] did not reduce motion errors, it enhanced the user experience reported in survey responses. Haptic feedback has also been employed in training simulators: the ability of trainees to learn faster by adding haptic feedback to computer simulations was evaluated for coursework teaching and driving training in [6, 7], respectively. Haptic feedback has also been used to assist in a writing-training application between an expert and a beginner [8]. In tele-operated tasks, added haptic feedback improved users' performance in remotely manipulating objects with a robotic arm [9].

Additionally, a few studies have evaluated dual-sensory-modality performance in HAVEs. In a collaborative environment, verbal communication and haptic feedback, as well as combined haptic and verbal feedback, were evaluated in a navigation task in terms of task completion time [10]. Results show that participants under the haptic-only condition take longer to complete the task. A hockey game was tested using different dual-sensory-modality and three-sensory-modality combinations in a collaborative environment. Dual-sensory-modality and three-sensory-modality combinations were also studied in [11] by presenting stimuli to which the user responds by pressing the corresponding button.

Most of the current research focuses on evaluating quantitative task performance [9, 12–18], tele-presence and social presence [12, 19–21], subjective performance [22–25], user experience [1, 6, 26] and subjective perceived safety [1]. In this paper, both subjective and objective measures are used to assess different combined sensory modalities. This paper also evaluates the amount of cognitive workload for different single sensory modalities and investigates the effects of workload on task performance.

3 Methods

3.1 Participants

Twenty-nine participants volunteered for this experiment. Participants were male and female undergraduate and graduate students from the University of Waterloo, and all were regular computer users. Demographic data were not collected from participants.

3.2 Experimental Design

A between-subject design is utilized in this experiment. There are three trials: the first consists of the primary task only, the second adds a secondary cognitive task, and the third is conducted under the same conditions as the second. The independent variable is the combination of indications sent to the participant, which differs by condition: Audio-Visual (AV), Haptic-Visual (HV) or Audio-Haptic (AH). The dependent variables are the time it takes the participant to press the virtual button (response time) and the number of times the participant presses the correct button (accuracy).

3.3 Procedure

The implementation of this experiment includes two different tasks: a primary task and a secondary task. During each experiment, each participant performs three trials. The first trial includes the primary task only; the second and third trials include both the primary task and the secondary task. The purpose of the third trial is to indicate the learning effect for each condition.

The primary task includes three different feedback modalities; they are auditory, visual and haptic. In the primary task, two of the modalities are combined to form the indications to press the virtual buttons; the combined modalities are Audio-Visual (AV), Haptic-Visual (HV) and Audio-Haptic (AH). During each primary task, every participant is introduced to only one of the combinations.

When the first trial starts, the participant completes one minute of training to become familiar with the operation of the haptic device; another minute of training precedes the second trial. For each primary task, each participant presses the virtual button indicated by the feedback combination. Depending on the combined feedback modes, the participant receives a flash of light from the button itself, a tone from the headset or a force from the haptic device.

The direction of the light, tone or force indicates the button to press. For instance, in the AV condition, if a participant receives a flash of light from the up button and a tone from both earbuds, the participant is required to press the up button. Figures 1, 2 and 3 show all possible haptic, auditory and visual feedback modes for this experiment. For each trial, a button is chosen randomly at 3-s intervals. Participants are required to press all indicated buttons in the first trial. In the second and third trials, participants press the buttons while performing the secondary task described below. All trials are 5 min long, with 1 min of training prior to the first and second trials.
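The stimulus schedule just described (one randomly chosen button every 3 s over a 5-min trial) can be sketched as follows. This is a minimal illustration under assumed names; the button labels, function name and seed are not from the authors' implementation:

```python
import random

# Hypothetical sketch of the primary-task stimulus schedule: one of three
# virtual buttons is chosen at random every 3 s over a 5-minute trial.
BUTTONS = ["left", "up", "right"]
STIMULUS_INTERVAL_S = 3
TRIAL_LENGTH_S = 5 * 60

def build_stimulus_schedule(seed=None):
    """Return a list of (onset_time_s, button) pairs for one trial."""
    rng = random.Random(seed)
    return [(t, rng.choice(BUTTONS))
            for t in range(0, TRIAL_LENGTH_S, STIMULUS_INTERVAL_S)]

schedule = build_stimulus_schedule(seed=42)
print(len(schedule))  # 100 stimuli per 5-minute trial
```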

The secondary task includes three different feedback modalities: auditory, visual and haptic. In the secondary task, one of these modalities forms the information channel. Only the second and third trials include the secondary task.

After the first trial is completed, the participant has one minute of training to become familiar with the operation of the haptic device and the procedure of recognizing and writing the codes. For the secondary task, each participant performs the primary task (pressing a virtual button) while a Morse code (e.g. •— •• — — •) is conveyed to the participant through the remaining sensory feedback channel. Depending on the feedback mode, the participant receives flashes of light from the button itself, tones from the headset or vibrations from the haptic device. For example, a participant receives flashes of light from the virtual button when the visual feedback mode is chosen. All codes are between two and six symbols long. For the secondary task, a code is chosen randomly every 60 s. Participants are required to recognize and write down the code; the experiment is automatically paused every 60 s (the participant can also pause it by pressing the space bar) to give the participant a chance to write the code on the paper provided.
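The secondary-task encoding described above can be sketched as follows. This is an illustrative assumption, not the authors' implementation: a random dot/dash code of two to six symbols is drawn and rendered on a single channel as on/off pulses (light flashes, tones or vibrations), with the conventional dash-is-three-units timing and an assumed 200-ms unit:

```python
import random

DOT, DASH = ".", "-"

def random_code(rng, min_len=2, max_len=6):
    """Draw a random dot/dash code in the length range used in the study."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(DOT + DASH) for _ in range(length))

def to_pulses(code, unit_ms=200):
    """Map each symbol to (on_ms, off_ms); dot = 1 unit, dash = 3 units."""
    return [((1 if s == DOT else 3) * unit_ms, unit_ms) for s in code]

rng = random.Random(7)
code = random_code(rng)
pulses = to_pulses(code)  # channel-independent: flash, beep or vibrate
```

The same pulse list can drive any of the three channels, which is what allows the code to stay identical while only the physical carrier changes.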

The quantitative data collected in this experiment are the primary task response time (the time it takes a participant to press the appropriate button), the primary task accuracy (the number of correct button presses), the secondary task response time (the time it takes a participant to recognize the correct code), and the secondary task accuracy (the number of correct codes) (Fig. 1).
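The measures listed above can be computed from a simple event log. The record format below is an assumption for illustration only:

```python
# Minimal sketch (assumed data format): each logged event records the
# indicated button, the button actually pressed, and the response latency.
def summarize(events):
    """events: list of dicts with keys 'indicated', 'pressed', 'latency_s'."""
    correct = [e for e in events if e["pressed"] == e["indicated"]]
    accuracy = len(correct) / len(events)
    mean_rt = sum(e["latency_s"] for e in correct) / len(correct)
    return {"accuracy": accuracy, "mean_response_time_s": mean_rt}

log = [
    {"indicated": "up", "pressed": "up", "latency_s": 0.9},
    {"indicated": "left", "pressed": "left", "latency_s": 1.1},
    {"indicated": "right", "pressed": "up", "latency_s": 1.4},
]
print(summarize(log))  # accuracy 2/3, mean RT of correct presses 1.0 s
```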

Fig. 1. Haptic feedback mode: (A) a force to the left; (B) a force up; (C) a force to the right.

4 Results

The results are analyzed using a between-subject Analysis of Variance (ANOVA) and multiple two-sample t-tests. The analysis is performed at the 0.05 significance level; in some cases a 0.10 or 0.15 level is used, and the significance level is stated whenever it is higher than 0.05. The average response time, accuracy, subjective performance and subjective workload are calculated for all trials and conditions after eliminating all outlying data points (Fig. 2).
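The between-subject one-way ANOVA used in this analysis can be sketched in pure Python. The data values below are illustrative, not the study's measurements; in practice the F statistic is converted to a p-value via the F distribution (e.g. `scipy.stats.f.sf`) and compared against the chosen alpha level:

```python
def one_way_anova_F(*groups):
    """F statistic for k independent groups (between-subject one-way ANOVA)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Illustrative response times (s) for three hypothetical conditions.
av = [1.2, 1.3, 1.1, 1.4]
hv = [1.0, 0.9, 1.1, 1.0]
ah = [1.5, 1.6, 1.4, 1.5]
print(one_way_anova_F(av, hv, ah))
```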

Fig. 2. Auditory feedback mode: (A) a tone from the left earbud; (B) a tone from both earbuds; (C) a tone from the right earbud.

Fig. 3. The graphical user interface (GUI) and visual feedback mode: (A) left virtual button flashing; (B) up virtual button flashing; (C) right virtual button flashing.

4.1 Human Performance

In this section, the results of the human performance analysis are presented. Figure 4 summarizes the average primary task response time and standard deviation among trials. Figure 4 shows a significant difference in response time between the AH condition and the HV condition in Trial 1. A slight difference between the HV condition and the AV condition in Trial 1 can also be seen in Fig. 4. The ANOVA result, p = 0.0102, supports the alternative hypothesis that there is a difference between sensory modalities in the first trial, which involves only the primary task.

Fig. 4. Average primary task response time.

Because of the introduction of the secondary task, there is a slight increase in response time, especially apparent in the AV and HV conditions. In the second trial, the null hypothesis is also rejected at the 0.05 alpha level; a p-value of 0.0148 indicates a difference in response time between conditions.

According to Fig. 4, there is a difference in response time between the AH and AV conditions in the third trial at a significance level of 0.15. The ANOVA test p-value is 0.1364, and the two-sample t-test p-value for the difference between the AH and AV conditions is 0.0290. It is worth noting that the response time increased considerably for the HV condition from the second trial to the third trial. The two-sample t-test shows no evidence for a difference between the AV and HV conditions (p >> 0.05) or between the HV and AH conditions (p >> 0.05).

Figure 5 shows the average primary task accuracy among all trials. The accuracy for all conditions in the first trial is considerably high: the accuracies for the AV and AH conditions are 97 % and 98 %, respectively, a negligible difference of 1 %. The accuracy in the second trial is consistent across conditions, with 97 % achieved for all conditions in Trial 2.

Fig. 5. Average primary task accuracy.

Fig. 6. Average secondary task response time.

In the third trial, only the AV condition's accuracy decreased, to 93 %. ANOVA tests and multiple two-sample t-tests do not reveal any difference in accuracy between conditions.

4.2 Cognitive Workload

This section discusses the second part of the experiment, which evaluates human cognitive workload. The response time of the secondary task for the second and third trials is plotted in Fig. 6. In both trials, auditory feedback is generally the fastest in terms of response time, while participants exposed to visual feedback in the secondary task take more time to recognize the codes. In the third trial, however, the visual feedback response time is faster than the haptic feedback, yet still slower than the auditory feedback mode. The auditory feedback mode has the highest accuracy of all conditions in Trial 2 and Trial 3, and the visual feedback mode shows higher accuracy than the haptic feedback mode.

Fig. 7. Average secondary task accuracy.

One-way ANOVA analysis does not show any difference in average response time among conditions in either the second or the third trial (p >> 0.05). Figure 7 depicts the average secondary task accuracy between feedback modalities for all trials. As can be seen from the figure, there is a difference between the visual condition and the auditory condition in the accuracy of identifying the codes (p = 0.0224); the auditory condition shows higher accuracy than the visual condition. No difference is detected between the visual and haptic conditions or between the auditory and haptic conditions. In the third trial, the ANOVA test shows the same trend at the 0.10 significance level (p = 0.0709).

Two-sample t-test analysis of the secondary task accuracy supports the one-way ANOVA: there is a difference between the haptic and auditory conditions (p = 0.0223) and between the visual and auditory conditions (p = 0.0183).

5 Discussion

5.1 Human Performance

The lack of effective feedback is arguably one of the most common problems in interface design. The results of this study provide some basic foundations for the design of human-in-the-loop applications. Most current implementations utilize only visual or only auditory sensory modalities to provide feedback to users. Moreover, some studies have shown that using only one sensory modality is inadequate [10, 27]. To address these problems, sensory modality replacement or addition may be adopted in current and future interfaces. This research can potentially be applied in HCI, tele-operation, collaboration, communication and medical applications (Fig. 6).

Many researchers find that using only audio and visual communication is ineffective. In [27], it is found that haptic communication increases presence and enhances user experience. Similarly, in this study it can be seen that haptic coupled with visual feedback has the lowest response time in the primary task. Although haptic feedback can increase human performance, the absence of visual feedback can be problematic [1].

In terms of tele-operation, the aviation industry is a promising area in which to implement multi-modal feedback. For instance, the ground control station of an Unmanned Aerial Vehicle (UAV) can be enhanced with added haptic feedback. While most commercial and military airplanes provide haptic feedback by nature since they are mechanically operated (except some fly-by-wire airplanes), UAVs do not. The addition of haptic feedback to a UAV's ground controller can enhance the user experience and increase performance. In an assembly task, virtual forces have been shown to increase accuracy and decrease the time to complete a tele-operated task [9].

The results of this research study can also be used in the design of HMI. Most warnings in computers are visual, so the visual sensory modality of a computer user is already taxed. If the auditory sensory modality is reserved for receiving auditory content, visual messages, such as "low battery", can disturb the user. Instead, such messages can be sent haptically through available computer peripherals such as a mouse or a trackpad. Considering the limits of our senses, using one additional sensory feedback modality can be more suitable. For example, a car engine's sound provides a lot of information about its condition, yet humans might not be able to distinguish a healthy engine from a failing one. A change in engine sound could therefore be transmitted as visual information for the car operator to examine.

Our findings support the use of more than one sensory feedback modality in virtual environments to improve task performance. These findings complement existing results in the literature. For example, results from a collaborative virtual environment study show that participants under a haptic-only condition take longer to complete a task than under a combined haptic and verbal condition [10].

5.2 Cognitive Workload

Recent research supports the use of haptics as a communication medium [28–30]. Instead of overloading the visual and auditory channels, haptic cues are conveyed to users engaged in a demanding task. For instance, a haptic turn-taking protocol is suggested in [30].

Additionally, visual feedback imposes more cognitive workload as graphical user interfaces become more complex [31]. It has been found that increasing the dimensionality of visual interfaces from 2D to 3D decreases participants' ability to locate, interact with and manipulate objects. To address this problem, implementing another feedback mode is necessary to reduce cognitive workload and ensure effective communication.

This study also investigates the effects of cognitive loading on users' judgement and on their ability to recognize an encoded message while engaged in another demanding task. The results on workload can specifically be applied to communication: the sensory modality used to communicate can, depending on the context, be chosen to convey a message effectively. For instance, a flight crew is frequently engaged in multiple communications at a time; pilots communicate with each other in a multi-crew setting and with Air Traffic Control (ATC) over an auditory channel. Therefore, a haptic turn-taking protocol, similar to the one defined in [30], could be used for pilots to relinquish and acknowledge control of the airplane.

6 Conclusion and Future Work

The information age has inundated communities with information, and humans must make more decisions ever more rapidly. Many tools, such as notepads, personal digital assistants (PDAs) and cellphones, have been invented to augment the brain or off-load some of its resources to external functions. Although the human brain can store much information, humans have retrieval limitations. Consequently, in highly demanding safety-critical tasks, a proper feedback and communication model should be implemented to increase human performance and decrease perceived workload. Our goal in this study is to find an optimal feedback and communication model. The results of this study support the hypothesis that there exist combined sensory modalities that yield faster response times and higher accuracy. However, this experiment had some limitations: it investigated only 2D environments, the second level of collaboration, and a limited range of codes. In the next stage, the environment will be modified to include higher dimensions, higher levels of collaboration and an extended range of codes. The future system will provide more information about the differences between feedback and communication modes.