1 Introduction

1.1 AR Content Display

Augmented Reality (AR) is a technology used to project artificial visual content into the real world. Using an optical head-mounted display (HMD), users can see both the real world and the superimposed AR content simultaneously without turning their head or moving their eyes to a different display. AR content can be displayed to both eyes (binocular) or one eye (monocular, left or right eye).

1.2 Binocular Display

When a binocular display is used, users benefit from (a) being able to use both eyes to focus on the same content, avoiding binocular rivalry [1, 2] or eye dominance concerns [3]. Binocular rivalry is a phenomenon of visual perception in which perception alternates between different images presented to each eye. Eye dominance is the tendency to prefer visual input from one eye over the other. Another user benefit of binocular displays is (b) the possibility of depth perception.

On the other hand, binocular displays are subject to the vergence-accommodation conflict [4], a well-known problem for head-mounted stereoscopic displays that forces the user’s brain to adapt unnaturally to conflicting depth cues. This conflict increases the fusion time of the two views while reducing fusion accuracy.

1.3 Monocular Display

Previous research indicates that switching focal depth between virtual content and the physical world is easier and faster when monocular displays are used [5]. Additionally, with a monocular view it is possible to have a larger virtual information overlay in one eye while also not occluding or blocking the view of real-world surfaces and objects in the other eye.

1.4 Study Objectives

In this study, we focus on the quality of experience (QoE) of users performing manual tasks by following real-time, step-by-step instructions, both for 2D tasks such as drawing cartoon characters and for 3D tasks such as assembling plastic building bricks. In these tasks, the instructions are presented in an AR HMD under three occlusion conditions (binocular, monocular while masking the other eye, and monocular without masking the other eye). The participants need to switch their attention between the virtual instruction overlay (the displayed instructions) and the real-world 2D surface or 3D object (on which they perform the drawing and assembly tasks). We then compare the QoE reported by users across the occlusion conditions.

2 Methodology

2.1 Participants

Testing sessions were conducted at the Lenovo Research Lab in Research Triangle Park (RTP), North Carolina, USA. Twelve adults between 23 and 61 years of age, with normal or corrected-to-normal vision, participated in this study.

2.2 Tasks

Participants performed a 2D task of drawing cartoon characters (Mickey Mouse or Goofy) on a piece of paper by following real-time, step-by-step instructions [7, 8]. The participants also performed a 3D task of assembling LEGO bricks [9]. In both tasks, the instructions were presented in an AR HMD (ODG R-7 Smartglasses) under three occlusion conditions (binocular, monocular while masking the other eye, and monocular without masking the other eye) (Figs. 1 and 2).

Fig. 1. 2D task – drawing cartoon characters

Fig. 2. 3D task – assemble a kit of plastic building brick pieces

2.3 Procedures

The procedure began with each participant configuring the size and placement of the virtual AR content in the HMD display according to their preference. The ODG R-7 Smartglasses have a display resolution of 1280 × 720 pixels per eye, and participants could choose the AR content size from native full-screen (1280 × 720 pixels), large (960 × 540 pixels), medium (640 × 360 pixels), and small (320 × 180 pixels). For sizes other than full-screen, they could also choose the placement of the AR content in the HMD display (upper-left, upper-middle, upper-right, center-left, center-middle, center-right, lower-left, lower-middle, or lower-right). In addition, for the monocular display, participants chose the eye in which they preferred to see the display (left or right). While selecting their preferences, participants watched a portion of the YouTube instruction videos for the tasks.
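To make the size and placement options concrete, the following minimal sketch computes where a content window of each size would sit on the 1280 × 720 per-eye display. The function name `content_origin`, the top-left pixel origin, and the assumption that content is placed flush against the display edges with no margin are illustrative choices, not details reported for the study software.

```python
from typing import Tuple

# Per-eye display resolution stated in the text.
DISPLAY_W, DISPLAY_H = 1280, 720

# Content sizes offered to participants (width, height in pixels).
SIZES = {
    "full": (1280, 720),
    "large": (960, 540),
    "medium": (640, 360),
    "small": (320, 180),
}


def content_origin(size: str, vertical: str = "center",
                   horizontal: str = "middle") -> Tuple[int, int]:
    """Return the (x, y) top-left pixel of the content window.

    Assumes a top-left origin and edge-flush placement (hypothetical;
    the actual layout rules of the study software are not described).
    """
    width, height = SIZES[size]
    free_x = DISPLAY_W - width   # horizontal space left over
    free_y = DISPLAY_H - height  # vertical space left over
    x = {"left": 0, "middle": free_x // 2, "right": free_x}[horizontal]
    y = {"upper": 0, "center": free_y // 2, "lower": free_y}[vertical]
    return x, y
```

For example, `content_origin("medium", "lower", "right")` places a 640 × 360 window in the lower-right quadrant, while the full-screen size always maps to the display origin, which is consistent with placement being offered only for the smaller sizes.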

Each participant was then asked to perform the 2D drawing task by following the drawing instructions. Participants drew Mickey Mouse using one display format (monocular or binocular) and then drew Goofy using the other display format. The order of display formats was randomly selected for every participant. For the monocular display, participants started with both eyes open, and halfway through the task, they were asked to mask the eye without a display by wearing an eye patch [10], shown in Fig. 3.

Fig. 3. Eye mask used in experiment

For the 3D assembly task, participants started with a randomly selected display format (monocular or binocular) and switched to the other display format in the middle of the assembly. Again, for the monocular display, participants started with both eyes open, and after a sufficient number of assembly steps had been completed, they were asked to mask the eye without a display.

The instruction video for the 3D task ran at a fast pace even after the playback speed was reduced to 25% of normal speed. When a participant’s physical assembly fell behind the instructions, he or she was allowed to request a pause or rewind of the instruction video.

After the tasks were completed, the participants were asked to answer the following QoE questions:

  1. Do you prefer binocular or monocular AR display in terms of the experience of performing the tasks? Why?

  2. For monocular AR display, do you prefer to have the other eye masked or un-masked? Why?

3 Results and Discussion

3.1 Results

Table 1 lists the QoE (user preference) responses for binocular vs. monocular views. Table 2 lists the QoE (user preference) responses for using the mask for the monocular view.

Table 1. Participants’ QoE responses for binocular vs. monocular view.
Table 2. Participants’ QoE responses for with or without mask in the monocular view.

For both the 2D and 3D tasks, the majority of participants (83% for the 2D task and 100% for the 3D task) rated the binocular display as the best experience. They provided the following reasons:

  • “Two eyes have better viewing quality.”

  • “Both eyes can focus on the same spot. That is more relaxing.”

  • “Two eyes provide a much better experience.”

For the 2D task, the majority of the participants (67%) rated the experience of using the monocular display with the other eye masked as better than the experience of using the monocular display without masking the other eye. They listed the following reasons:

  • “With the mask, there is no competition between the two eyes.” (no rivalry)

  • “With the mask, the eye seeing the display feels more comfortable.”

  • “There is a better focus with the mask on.”

  • “With the mask, I was able to see the bottom part of the Mickey Mouse better, while without the mask the whole drawing area is positioned lower so I cannot see.”

  • “Without mask, I got a very serious duplicated view, and the eye without the display is not comfortable.”

  • “With mask, I have much clear view compared to the view without mask.”

For the 2D task, a small subset of participants (25%) rated the experience of using the monocular display without masking the other eye as better than the experience of using the monocular display with the other eye masked. They listed the following reasons:

  • “To wear the mask made my eyes hurt, and I had watering eyes quickly after starting the task.”

  • “When the other eye was blocked, the eye being used felt very tired, two eyes were not coordinated and focused together. That made me feel uncomfortable.”

  • “For single-eye with the mask, the video’s position sometimes blocks my drawing hand, maybe a smaller video display might resolve this problem.”

On the other hand, for the 3D task, the majority of the participants (67%) rated the experience of using the monocular display without masking the other eye as better than the experience of using the monocular display with the other eye masked. They listed the following reasons:

  • “For 3D task, I needed depth perception so seeing different contents in each eye can be tolerated because of the gain for depth perception.” This comment was given by multiple participants.

  • “Without the mask, I can find the LEGO parts faster.”

  • “With the mask on, I lost depth perception, so I could not pick up pieces very well, even confused on the colors of the LEGO pieces.”

  • “With the mask on, the eye behind the mask started to see some statics, like snow flakes on a TV channel without broadcast program, …”

For the 3D task, another small subset of participants (25%) preferred the experience of using the monocular display with the other eye masked over the experience of using the monocular display without masking the other eye. They listed the following reasons:

  • “I experienced serious duplicated view. That made me feel very uncomfortable in the other eye where there was nothing to display.”

  • “Without mask, the image becomes very blurred and the single most important factor is that I cannot live with that vision quality and with the mask, the quality is much better and clear.”

  • “Without mask, the view becomes blurred. With mask, it is much better.”

We had one participant (8%) who rated the two monocular experiences (with and without mask) as the same for both the 2D and 3D tasks. She stated the following reason:

  • “Single eye with or without mask feels the same for me. My left eye has better vision than the right eye. That maybe a factor.”
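As a quick consistency check on the percentages reported above, the shares can be recomputed from head counts. This is only a sketch: the counts below are not stated in the text; they are inferred here from the reported percentages and the sample size of twelve, and the helper name `share` is hypothetical.

```python
TOTAL_PARTICIPANTS = 12  # sample size stated in Section 2.1


def share(count: int, total: int = TOTAL_PARTICIPANTS) -> int:
    """Percentage of participants, rounded to the nearest whole number."""
    return round(100 * count / total)


# Inferred head counts (assumption): monocular preference for the 2D task.
monocular_2d = {"masked": 8, "unmasked": 3, "no preference": 1}

# Convert each count to a rounded percentage.
shares_2d = {label: share(n) for label, n in monocular_2d.items()}
```

Under these inferred counts, the three groups account for all twelve participants, and the rounded shares match the 67%, 25%, and 8% figures quoted for the 2D task; the same mapping applies to the 3D task with the masked and unmasked groups swapped.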

3.2 Discussion

In the experiments reported in references [5, 6], AR images covered background figures displayed on monitors, and observers were asked to trace star-shaped frame patterns on the background figures. In these experiments, where the AR images and background images overlapped, the superiority of the monocular AR presentation was demonstrated through a wider Useful Field of View (UFOV) and better tracing accuracy.

In our experiments, nearly all participants reported that they chose to separate the AR content display area from the working area where they drew the cartoon characters or assembled plastic building pieces, tilting their head so that both views could be seen without any overlap or occlusion. These participants shifted their visual attention back and forth between the working area and the AR overlay area, as depicted in Fig. 4.

Fig. 4. A depiction of placing the virtual content adjacent to the physical workspace. Nearly all participants chose to arrange the content in this non-overlapping manner.

However, we did have one subject who wanted to overlap the instructions with his workspace, and asked if we could change the brightness of the AR display window so he could better see the working area through the HMD display.

Because nearly all participants naturally chose to position the AR display close to, but not overlapping, the working area, the majority of them found the binocular display to be the best, since it allowed them to avoid binocular rivalry. The vergence-accommodation conflict is also minimized because the two attention centers (the AR content area and the drawing/assembling area) do not overlap or interact in any way.

The experiment comparing the monocular display with and without the mask provided interesting results. This experiment tests whether users prefer to avoid the binocular rivalry problem or to retain depth perception of the physical space. A majority of participants (67%) preferred to wear the mask for the 2D task, since depth perception is not particularly helpful for this task, while an equally large majority (67%) preferred not to wear the mask for the 3D task, since the need for depth perception outweighed the annoyance of binocular rivalry. However, a significant subset (25%) of participants could not tolerate the “pain” of binocular rivalry in the 3D task and were willing to give up the benefits of depth perception.

Table 3 in the appendix lists additional User Experience (UX) comments given by the 12 test subjects. As these comments show, users have very diverse reactions to and preferences for the visual experiences.

4 Conclusion

When performing manual tasks by following real-time, step-by-step instructions, both for 2D tasks such as drawing cartoon characters on a piece of paper and for 3D tasks such as assembling plastic building pieces, participants naturally chose to separate the AR content display from the physical workspace and to place them in adjacent locations in the field of view. When shifting their attention between the AR content display and the drawing/assembling area, participants preferred the binocular display on the head-mounted device. For a monocular display, participants need to balance the benefit of the monocular display against the annoyance of binocular rivalry. While a majority of users can tolerate binocular rivalry, a significant subset of users has a low tolerance for it and prefers to mask or close the other eye.