1 Introduction

In 3D modeling systems, such as Autodesk Maya, users often use 2D input devices, typically mouse and keyboard, to manipulate objects in virtual environments. While it seems more appropriate to use 3D input devices, current technologies are not yet competitive for such applications. According to Hodson, the Leap Motion represents a huge step in the development of 3D input technologies, and enables the creation of 3D user interfaces that might eventually surpass the mouse [10].

Previous research on in-air interaction has pointed out several weaknesses, such as fatigue, lower accuracy, and in some cases slower interaction speed [16, 18]. Fatigue may also cause users to relax the poses needed for gestures, increasing the chances of interpretation errors [16] and decreasing pointing precision [3].

One way to address this is to build hybrid user interfaces that combine the freedom of in-air interaction with the precision of 2D devices. We draw upon the concept of casual and focused interaction. Casual interaction targets a different level of engagement, at which users want to or are able to interact with the system [14, 15], which is well suited for the Leap Motion. A hybrid solution might then enable users to perform operations with in-air interaction that are potentially inefficient with 2D input devices, such as coarse-scale 3D rotations, followed by fine adjustments with a 2D input device. Limiting in-air input to such coarse operations would also reduce the fatigue associated with prolonged use of in-air interaction.

Here, we investigate the Leap Motion in comparison to a keyboard and mouse setup for 2D manipulation tasks. We examine related work on in-air interaction, previous approaches for handling 3D and 2D input, and lastly models for determining the efficiency of in-air interaction and transitions between input devices. The goal is to investigate if a hybrid interface is a viable and efficient solution.

1.1 Related Work

Leap Motion. The Leap Motion controller is a 3D interaction device that allows users to interact with a system through free-hand motions and gestures. The device detects a user’s hands and distinguishes between unimanual and bimanual interaction, including the orientation of the hands and individual fingers. Previous research examined the accuracy and reliability of the Leap Motion during interaction [8, 19] and also identified fatigue issues. Other work compared it to touch or mouse interaction and found it to be less effective, in terms of accuracy and selection speed [16, 18] as well as pointing throughput [3]. In a direct comparison with the mouse and with the mouse wheel for (discrete) depth control, the Leap Motion was about half as fast for multiple 3D selection tasks [6]. Han and Gold found that the “normal” upright orientation of the controller seems to deliver the most consistent results in terms of tracking capabilities, followed by placing the controller at a \(\sim 45^{\circ }\) angle [9].

Un-Instrumented In-Air Interaction. To determine appropriate control schemes for un-instrumented (free-hand) in-air interactions, this section examines research on six degrees of freedom (6 DOF) input devices. We examine three main approaches for six DOF devices. The first approach focuses on absolute control, i.e., a one-to-one mapping between motions [1]. The second approach uses relative control, i.e., an indirect mapping. In one exemplar scheme, hand tilt controls the velocity of object tilt [5]. Schlattmann et al. used the direction of the index finger and also found that rotational mappings were preferred and bimanual interaction generated more fatigue than unimanual interaction [17]. The third approach uses more abstract mappings, involving gestures or keyboard clutches. Pareek and Sharma’s work for 3D CAD still required switching gestures, which made it more time consuming to go through several stages of manipulation [13].

Interaction with Two Degrees of Freedom Devices. The most common two DOF interaction device is the mouse. While it is often used to perform two DOF tasks, a combination of modifier keys and movements enables work in multiple DOF. This section examines how 2D devices can interact with and control environments with more than two DOF. Applications that support full 3D interaction, such as Autodesk Maya and Unity 3D, have all adopted similar control schemes for scaling, rotating and translating objects in six DOF. Different combinations of keyboard modifier keys (also known as clutches) together with various mouse actions enable manipulation of different DOFs. Zhao et al. used the (discrete) mouse wheel to control a third DOF for rotation [20].

Fitts’ Law. Fitts’ law models human movement and predicts the time required for performing a movement to a target area, such as moving a finger or cursor to a target and selecting it. The model is a function of the distance to the target and the target size. Fitts’ law uses an index of difficulty (ID) to describe the difficulty of the motor task. The equation for ID is given in Eq. 1 [7, 12].

$$ID = \log _{2}\left( \frac{amplitude}{width} + 1\right) \tag{1}$$

The movement time (MT) is then a linear function of the ID, i.e., \(MT = a + b \cdot ID\), where a and b are empirically determined constants. Fitts’ law has been used to compare a variety of different input techniques. Bérard et al. [2] presented another application of Fitts’ model. They developed a measure of device human resolution (DHR) for three input devices to determine the smallest possible target a user of a given input device can select with reasonable effort.
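As a minimal sketch of Eq. 1 and the linear movement-time model, the following Python snippet computes the ID and a predicted MT. The function names are illustrative, and the intercept and slope used in the example call are hypothetical values chosen only to show the call; they are not coefficients from the studies reported here.

```python
import math


def index_of_difficulty(amplitude: float, width: float) -> float:
    """Index of difficulty in bits (Eq. 1): log2(amplitude / width + 1)."""
    if amplitude < 0 or width <= 0:
        raise ValueError("amplitude must be non-negative and width positive")
    return math.log2(amplitude / width + 1)


def movement_time(a: float, b: float, amplitude: float, width: float) -> float:
    """Predicted movement time MT = a + b * ID, with a and b fitted per device."""
    return a + b * index_of_difficulty(amplitude, width)


# Hypothetical coefficients (a, b); amplitude and width share the same unit.
print(movement_time(a=0.2, b=0.5, amplitude=7.0, width=1.0))
```

Because only the ratio of amplitude to width enters the formula, any unit (ticks, pixels, millimeters) can be used as long as both quantities share it.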

The Keystroke-Level Model. One of the main models used to predict the performance of keyboard and mouse interfaces is the Keystroke-Level Model (KLM) [4]. This model predicts the completion time of error free tasks. For this, the interaction is split into a sequence of simple operators, each with a time estimate. The total predicted time for a task is then the sum of operators. KLM has been adapted for many interfaces through new time estimates and adding new operators that describe interaction parts for that specific system. Holleis et al. identified that the homing operator, for transitions between keyboard and mouse, is irrelevant for mobile phones [11]. The homing operator is relevant for hybrid interfaces, as adding a device to a system that necessitates additional transitions can impact performance. Here, we are examining transitions between mouse, keyboard and in-air interaction to determine the transition costs.
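The operator summation can be sketched in a few lines of Python. The operator times below are commonly cited textbook estimates for KLM and are treated here as assumptions rather than measurements from this work; the extra operator "A" for homing to an in-air device, and its value, are purely hypothetical placeholders that illustrate how a hybrid setup would extend the operator set.

```python
# Commonly cited KLM operator estimates in seconds (assumed defaults).
OPERATOR_SECONDS = {
    "K": 0.28,  # keystroke or button press
    "P": 1.10,  # point at a target with the mouse
    "H": 0.40,  # home the hand between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def klm_time(sequence: str, operators: dict = OPERATOR_SECONDS) -> float:
    """Predict error-free task time as the sum of the operators in `sequence`."""
    return sum(operators[op] for op in sequence)


# Hypothetical extension: "A" homes the hand to an in-air device.
# The value 1.0 s is a placeholder, not a measured transition time.
hybrid_operators = {**OPERATOR_SECONDS, "A": 1.0}

print(klm_time("HPK"))                    # home to mouse, point, click
print(klm_time("AK", hybrid_operators))   # home to in-air device, press clutch
```

Keeping the operator table as a plain dictionary makes it easy to substitute measured transition times for the placeholder once they are available.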

2 Methods and Materials

Here we explore in-air interaction through three user studies. The initial two pretests collect information about the users and the device, which then informed the design of our main study. The first pretest focuses on uni- and bimanual gestural interaction when interacting with the Leap Motion, as well as the preferred position of the Leap Motion, extending the work by Han and Gold, who investigated only orientation [9]. The second pretest measures the DHR for the Leap Motion to identify a reasonable minimum object size for interaction. The findings from these pilots inform the main user study in terms of the best physical position of the Leap Motion, gestures for the manipulation of objects, and reasonable target deviation thresholds for the main user study.

2.1 First Pretest

First, we focused on uni- and bimanual interactions with the Leap Motion. This pretest had two phases. In the first phase, users had to pluck petals, with the device placed at four different positions (in front, behind, left, and right of the keyboard), to identify the most efficient position for the Leap Motion. The second phase was an elicitation study in which we prompted users to use gestures for object manipulation (rotate, scale, translate, and select) with the device. The pretest had six right-handed participants, one female, with ages from 21 to 28 years (M = 25.7, SD = 2.49), who all had experience with 3D interaction devices.

We did not compare unimanual and bimanual operation directly. Completion times differed significantly across device positions for both unimanual (\(F_{3,15} = 3.38\), \(p < 0.05\), \(\eta ^2 = 0.33\)) and bimanual (\(F_{3,15} = 3.28\), \(p < 0.05\), \(\eta ^2 = 0.33\)) interaction. There was also a significant difference between the behind and the left position. There was a significant interaction between the in front and right positions for unimanual and the front position and both the left and right positions for bimanual. The participants rated both the front and behind positions higher than the left and right positions. These differences were statistically significant (\(\chi ^2(3) = 10.50\), \(p < 0.05\)). Both in front and behind positions were significantly better than the left, but not compared to the right.

Fig. 1. Exemplary hand gestures proposed by users: (A, B) Unimanual translation. (C, D) Unimanual selection. (A, B, C, D) Users mirror unimanual gestures for bimanual. (E) Unimanual rotation. (F) Bimanual rotation. (G) Unimanual scale. (H) Bimanual scale.

Overall, for both uni- and bimanual interactions the smallest interaction times occurred when the device was placed in front or behind the keyboard. In addition, these positions were preferred and observations confirmed that interaction poses were also more relaxed for these conditions.

For the second phase we elicited gestures for various 3D manipulations. Illustrative examples of the resulting gestures are shown in Fig. 1. In general, users preferred unimanual interaction for tasks involving selection, rotation, and translation, while scaling had equal preference for uni- and bimanual interaction.

The findings of this pretest informed our main study as follows. (1) We position the Leap Motion in front of the keyboard. (2) We use the following gestures for the manipulation of targets in the main user study: For translation tasks, we use the position of the hand to move the targets, as in Fig. 1(A) and (B). For rotation, we orient objects with the rotation of the users’ hand and wrist, as in gesture E. For scaling, we use the distance between the fingers and thumb and scale targets proportionally, as in G. No bimanual gestures were used. This also enables us to use the keyboard as a clutch for the Leap Motion in the main study, which lets users control when the Leap Motion should detect interactions.

2.2 Second Pretest - Device Human Resolution

The purpose of this pretest is to determine the DHR [2] of the Leap Motion controller. This determines the minimum usable target size and enables comparisons to the DHR of other devices. We replicated the setup used by Bérard et al. [2]. Participants had to align a pointer within a one-dimensional target area, which decreased in size.

Six male volunteers, with ages 21 to 26 years (M = 23.33, SD = 1.63) participated. There was a sequence of seven target sizes in decreasing order, with a width of 32, 24, 16, 8, 4, 2, and 1 ticks, each repeated 20 times and with 250 ticks distance from the starting point. The interface was displayed on a 1920\(\,\times \,\)1080 monitor, and a single tick corresponded to moving four pixels on screen.
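To make the difficulty range of this setup concrete, the following sketch applies Eq. 1 to the stated distance of 250 ticks and the seven target widths. The constant and function names are illustrative assumptions; only the distance and widths come from the description above.

```python
import math

DISTANCE_TICKS = 250
WIDTHS_TICKS = [32, 24, 16, 8, 4, 2, 1]


def pretest_ids(distance: float = DISTANCE_TICKS, widths=WIDTHS_TICKS) -> dict:
    """Map each target width (in ticks) to its index of difficulty in bits."""
    return {w: math.log2(distance / w + 1) for w in widths}


for width, id_bits in pretest_ids().items():
    print(f"{width:>2} ticks -> ID = {id_bits:.2f} bits")
```

Under these values the IDs span roughly 3 to 8 bits, with the 8-tick and 4-tick targets falling close to IDs of 5 and 6.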

Table 1. Results of the second pretest

The results of the experiment are summarized in Table 1. We calculated a linear regression for the data to analyze the deviation from Fitts’ model, for each subset of three successive IDs (except for the first and last). Following Fitts’ law, we would expect the slope to remain close to constant. A significant increase would indicate a DHR threshold, but no subset slope deviated significantly from the overall one (0.66). Yet, higher IDs show higher variability, with a large increase for the last ID (where no other comparison point exists). This is similar to the free-space device results in Bérard et al.’s work [2]. For the average error rate per task, shown in Fig. 2, we see a growth between ID 6 and 7, followed by a larger increase. Thus we can expect a reasonable failure rate up to \(\sim \)2 mm target size. We performed a Friedman rank sum test on both the error and time data. The two smallest targets have significantly higher errors than the rest (\(\chi ^2(6) = 28.32\), \(p \ll 0.001\)). This is followed by 4 and 24 ticks target size, followed by the remaining three. In terms of timings, there are significant differences between all pairs (\(\chi ^2(6) = 35.43\), \(p \ll 0.001\)), except for 32 and 16 tick widths.

Fig. 2. Average error distance for each ID with standard error bars.

We conclude from this pretest that with the Leap Motion target sizes should not go below 1.2 mm (ID of 6). A reasonably low error rate can be achieved for target sizes of 2.4 mm (ID of 5) and above. Thus, and to ensure comparable difficulty, we set the task thresholds in our main study to 0.036 mm for the mouse and 2.4 mm for the Leap Motion.

2.3 Main User Study

The purpose of this study is to determine the cost of transitions between the Leap Motion and a keyboard and mouse setup. We also aim to develop a model of transition times for a three device setup, illustrated in Fig. 3. Further, we examine two-dimensional interaction to identify the differences between devices.

Participants. 31 volunteers were recruited from the local university. Ages ranged from 21 to 36 years (M = 24.6, SD = 3.56). Five were female. All participants were right-handed, regular users of computers, and experienced with pointing devices and uninstrumented interactions.

Fig. 3. Illustration of hybrid interface with Leap Motion, keyboard, and mouse. The arrows indicate transitions between center points of individual devices. * Indicates the special mouse to mouse transition, see text.

Apparatus and Materials. The experiment was conducted on laptops with 15.6" screens at 1366\(\,\times \,\)768. We used a Leap Motion controller in the standard configuration and a mouse at 1800 DPI with acceleration disabled. Participants were allowed to relocate the mouse and Leap Motion for a comfortable working posture. Figure 3 shows the setup. Distances between the centers of the devices were measured after each participant had completed the test. The average distances were: keyboard and mouse (M = 36.4 cm, SD = 3.8 cm), keyboard and Leap Motion (M = 22.6 cm, SD = 2.8 cm), and finally Leap Motion and mouse (M = 33.4 cm, SD = 5.9 cm).

Fig. 4. Illustration of the three main tasks. In the rotation task, users match the target orientation, assisted by the transparent overlay. In the translation task, users drag the green box to the red one. The rightmost image shows the scaling task, where participants had to scale the green box to match the dimensions of the red one.

Procedure. Before the experiment, each participant was given a short demographics questionnaire and an introduction to the Leap Motion. The following training session used tasks similar to those in the experiment and familiarized users with all control schemes and tasks. Training was repeated at least twice or until participants felt confident in the tasks. Each task involved rotating, scaling or translating an object, as shown in Fig. 4. In order to perform each task, the user used a key on the keyboard with one hand (typically the non-dominant one) to “clutch” the tool, and then depending on the task, used the mouse or the Leap Motion with the other (dominant) hand. To complete a task, the user-controlled object needed to match the target within a certain threshold, as determined by the pretest. As previous research indicated that larger numbers of DOF reduce both accuracy and selection speed [6] for the Leap Motion, we deliberately restricted manipulation to two DOF. We measured the completion time of each task, the transition time between keyboard, Leap Motion, and mouse in each direction, how precisely the participant matched the target, and finally the number of times the clutch was engaged during each individual task. The whole test took approximately 10 min.

Design. The experiment was a 3\(\,\times \,\)2\(\,\times \,\)2 within-subject design, with the main task type (scale, rotation, translation), primary interaction device (mouse and Leap Motion), and alternate tasks (involving either the keyboard or mouse) as independent variables. The three primary tasks are shown in Fig. 4. The main dependent variables were the task completion time, the transition times between the different devices, and the number of times that the clutch was activated. A timeline illustrating the transitions is shown in Fig. 5.

Fig. 5. A timeline illustrating the different transitions in the main study.

We used alternate tasks to create situations where participants had to transition between input devices; this ensured that we measured times consistently. We measured only single hand transitions and enforced this by having the users activate a keyboard clutch with the other hand while interacting. There were two types of alternate tasks. One required a transition to the mouse, the other to the keyboard. For the mouse alternate task, the users had to press an on-screen soft key with the mouse (always with the same movement distance). The keyboard alternate task prompted the users to press both control keys on the keyboard to continue. Transition times from the main task devices (Leap Motion and mouse) to the alternate task devices (keyboard and mouse), were measured from the time the participant had completed the main task until they had completed the alternate one. Transitions in the opposite direction, from the alternate task device to the main devices, were measured from when the participant had completed an alternate task until they started a main task. To reduce the potential influence of mental preparation, we deliberately designed the tasks to be simple and effectively routine by the time participants had completed training.
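The two transition measures can be expressed as differences between logged timestamps. The sketch below assumes a hypothetical per-trial log with four timestamps; the record layout, field names, and example values are illustrative and do not describe the study's actual logging software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialLog:
    """Timestamps in seconds around one main task (hypothetical layout)."""
    previous_alternate_done: float  # alternate task before the main task completed
    main_started: float             # first manipulation with the main device
    main_done: float                # target matched within the threshold
    next_alternate_done: float      # following alternate task completed


def transition_to_main(log: TrialLog) -> float:
    """Alternate device to main device: alternate completed until main task started."""
    return log.main_started - log.previous_alternate_done


def transition_to_alternate(log: TrialLog) -> float:
    """Main device to alternate device: main task completed until alternate completed."""
    return log.next_alternate_done - log.main_done


# Made-up timestamps, only to show the calls.
example = TrialLog(previous_alternate_done=10.0, main_started=11.5,
                   main_done=18.0, next_alternate_done=20.25)
print(transition_to_main(example), transition_to_alternate(example))
```

Note that the transition to the alternate device includes the time needed to perform the alternate action itself, which is why the mouse-to-mouse case can serve as an estimate of mouse travel and reaction time.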

The gestures for manipulating targets for a given task type were based on absolute and relative mappings, see Sect. 1.1. The translation tasks manipulated the x- and y-coordinates, i.e., needed only 2D input, with a relative mapping. The keyboard clutch (the control key) enabled participants to reposition their hand, e.g., when moving outside of the Leap Motion’s tracking area. The scaling tasks were visually 2D, but the scaling was uniform, effectively making this a 1D task, with an absolute mapping of the distance between fingertips. The rotation tasks are visually a rotation of a 3D target, inspired by Zhao et al. [20]. Yet, in our study the object rotates only around two axes and the interaction used a relative mapping with the keyboard clutch to avoid over-rotation of the wrist.
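The difference between the clutched relative mapping and the absolute mapping can be illustrated with a short sketch. The class and function names, the gain parameter, and the reference distance are hypothetical; the code only shows the general technique, not the implementation used in the study.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


@dataclass
class ClutchedRelativeMapper:
    """Moves an object by hand displacement only while the clutch key is held."""
    position: Vec2 = (0.0, 0.0)
    gain: float = 1.0                  # hypothetical control-display gain
    _last_hand: Optional[Vec2] = None  # last hand sample seen while clutched

    def update(self, hand: Vec2, clutch_down: bool) -> Vec2:
        if not clutch_down:
            # Clutch released: drop the reference so the hand can be repositioned.
            self._last_hand = None
            return self.position
        if self._last_hand is not None:
            dx = hand[0] - self._last_hand[0]
            dy = hand[1] - self._last_hand[1]
            self.position = (self.position[0] + self.gain * dx,
                             self.position[1] + self.gain * dy)
        self._last_hand = hand
        return self.position


def absolute_scale(fingertip_distance: float, reference_distance: float) -> float:
    """Absolute mapping: the scale factor follows the fingertip distance directly."""
    return fingertip_distance / reference_distance


mapper = ClutchedRelativeMapper()
mapper.update((0.0, 0.0), clutch_down=True)     # engage clutch, set reference
mapper.update((10.0, 5.0), clutch_down=True)    # object follows the hand
mapper.update((50.0, 50.0), clutch_down=False)  # reposition hand, object stays
print(mapper.update((50.0, 50.0), clutch_down=True))
```

With the relative mapping, the first sample after re-engaging the clutch only sets a new reference, so the object does not jump when the hand re-enters the tracking area.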

Participants received all tasks with the same device in a block, with the device order being counterbalanced across participants. The order of tasks within the two device blocks was randomized. Each of the 31 participants performed a total of 18 trials (2 input devices \(\,\times \,\) 3 tasks \(\,\times \,\) 3 difficulty levels).
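A blocked and counterbalanced schedule of this kind could be generated as follows. This is a sketch under stated assumptions: alternating the device order by participant index is one simple counterbalancing scheme and is not necessarily the one used here, and the difficulty labels are placeholders.

```python
import itertools
import random

DEVICES = ("mouse", "leap_motion")
TASKS = ("scale", "rotation", "translation")
DIFFICULTIES = (1, 2, 3)  # placeholder labels for the three difficulty levels


def trial_schedule(participant_index: int, seed: int = 0) -> list:
    """Return 18 (device, task, difficulty) trials for one participant."""
    rng = random.Random(seed * 10_000 + participant_index)
    # Counterbalance (assumed scheme): alternate which device block comes first.
    order = DEVICES if participant_index % 2 == 0 else DEVICES[::-1]
    schedule = []
    for device in order:
        block = list(itertools.product(TASKS, DIFFICULTIES))
        rng.shuffle(block)  # randomize trial order within the device block
        schedule.extend((device, task, level) for task, level in block)
    return schedule


for trial in trial_schedule(participant_index=0)[:3]:
    print(trial)
```

Seeding the generator per participant keeps each schedule reproducible, which helps when trial logs need to be matched to the planned order afterwards.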

Results. Through a repeated measures ANOVA test, we found that the device significantly affected task completion time (\(F_{1,30} = 196.72\), \(p \ll 0.001\), \(\eta ^2 = 0.15\)). Tasks took significantly longer with the Leap Motion (M = 8.75 s, SD = 0.72 s) than with the mouse (M = 4.8 s, SD = 2.97 s), see Table 2. The task type did not significantly affect completion time.

Table 2. The first entry in each box is the mean completion time in seconds and the one in brackets is the standard deviation.

Looking at the transition times from the main interaction device (Leap Motion or mouse) to the secondary one (keyboard or mouse), we identify a significant effect of the main device (\(F_{1,30} = 258.26\), \(p \ll 0.001\), \(\eta ^2 = 0.19\)). A small effect of the target device is present (\(F_{1,30} = 21.52\), \(p \ll 0.001\), \(\eta ^2 = 0.02\)). There is also an interaction between the two factors (\(F_{1,30} = 142.79\), \(p \ll 0.001\), \(\eta ^2 = 0.11\)). This is likely due to the lack of transitions for the mouse-to-mouse case, which thus measures only mouse travel and reaction times. See Table 3 for the mean transition times. In the transitions from the secondary interaction device to the main one, we found that the transition time was affected to a lesser degree by the main interaction device (\(F_{1,30} = 62.11\), \(p \ll 0.001\), \(\eta ^2 = 0.06\)).

Table 3. The first entry in each box shows the mean transition time in seconds and the one in brackets the standard deviation. The transitions are from the Main Device (MD), either the Leap Motion or the Mouse, to one of the two alternative tasks, or in the opposite direction.

Comparing both transition types, we see that the main interaction device has a small effect (\(F_{1,30} = 236.89\), \(p \ll 0.001\), \(\eta ^2 = 0.10\)). The direction of the transition also has a small effect (\(F_{1,30} = 98.26\), \(p \ll 0.001\), \(\eta ^2 = 0.05\)). A Friedman rank sum test found that the interaction device significantly affects the number of clutch actions (\(\chi ^2(1) = 25.14\), \(p \ll 0.001\)). The Leap Motion needed significantly more clutching. The main task type did not have a significant effect on clutching. See Table 4 for the mean number of clutch actions.

Table 4. The first entry in each box is the average number of clutch activations for each combination of task type and device, and the one in brackets is the standard deviation. The minimum number of clutch actions needed to complete any task with the Leap Motion was one, and zero with the mouse.

After completing the study, we gave participants a short questionnaire, which asked about fatigue (on a five-point Likert scale) for interacting with the Leap Motion, as well as their preferred interaction device. Twelve participants (39 %) stated that they experienced no fatigue and the rest experienced a moderate amount. Users stated that fatigue was not an issue for short sessions, but might become an issue for longer ones. Several participants placed their elbow on the table to reduce fatigue. 25 participants (80 %) expressed a strong preference towards the mouse, three chose the Leap Motion (10 %), and the remaining three had no preference (10 %). This difference is significant (\(\chi ^2(2) = 35.68\), \(p \ll 0.01\)). When asked to elaborate, participants mentioned previous experience and precision for the mouse. Others mentioned lack of fatigue as a factor for their preference towards the mouse. Many mentioned that further experience with the Leap Motion might improve their performance and preference. Several identified the Leap Motion as being fun, engaging, and a new experience.

3 Discussion

The results of our user study showed that the mouse input significantly outperformed in-air interaction in terms of completion time. The high variance for Leap Motion suggests that further training could reduce times. Further exploration is necessary to identify tasks that are better performed with in-air input.

In the current study, we examined the transition times between mouse and keyboard input and the Leap Motion. There was a significantly higher transition cost for the in-air device. However, these differences are not very large. Using the transitions between mouse and keyboard as a baseline, the transitions between the keyboard and the Leap Motion were only 0.48 s (16 % increase) longer, see Table 3. Transitions between the Leap Motion and the mouse took 1.26 s longer (47 % increase). Thus the average extra transition cost to and from the Leap Motion was only 0.87 s (32 % increase), relative to a mouse-keyboard transition. Subtracting the reaction time and mouse travel time (identified from the mouse-to-mouse case) we get an average mouse-keyboard transition of 0.37 s. This is comparable with the 0.4 s homing time from Card et al. [4], which partially validates our methodology.
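The averages quoted above follow from simple arithmetic on the two reported differences, as the following sketch shows. The variable names are illustrative; the numbers are the ones stated in the text.

```python
# Extra transition time relative to the mouse-keyboard baseline, in seconds.
extra_keyboard_leap_s = 0.48
extra_mouse_leap_s = 1.26

# Corresponding relative increases, in percent.
increase_keyboard_leap_pct = 16
increase_mouse_leap_pct = 47

mean_extra_s = (extra_keyboard_leap_s + extra_mouse_leap_s) / 2
mean_increase_pct = (increase_keyboard_leap_pct + increase_mouse_leap_pct) / 2

print(f"mean extra transition cost: {mean_extra_s:.2f} s")
print(f"mean relative increase: {mean_increase_pct:.1f} %")
```

The mean relative increase evaluates to 31.5 %, which corresponds to the reported 32 % after rounding.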

In the direct comparison with the mouse, the Leap Motion was slower. As transition times are not that long, it is still worthwhile to investigate the role an in-air device could play in a hybrid setup. One suggestion is to use the Leap Motion only for coarse adjustments. This way fewer transitions would be needed and the impact of transition times would be lessened. The tasks in our experiment involved only two DOF, which favors the mouse over the Leap Motion. Tasks that require more DOF could balance this out as the Leap Motion can provide (at least) six DOF for the hand or a single finger and potentially more when multiple fingers are used.

We identified that the number of clutch activations was higher for in-air interaction. For the translation tasks users had to translate objects from one side of the screen to the other, which required clutching at least once. Also, the tracking of the Leap Motion was worse in the outer reaches of the interaction area. This may have affected precision and encouraged clutching. Yet we also can see that coarse interaction, e.g., putting an object into an approximate position or a “general” orientation, is easier with the Leap Motion. Conversely, in hybrid interfaces, precise fine-tuning is better performed with the mouse. Such a hybrid approach also implicitly limits the amount of time that users spend interacting with the Leap Motion, thereby reducing fatigue.

The second pretest revealed that the Leap Motion could be used to select targets as small as eight ticks without a significant increase in effort. This is consistent with Bérard et al.’s findings [2], which partially validates our methodology. Yet, for the Leap Motion the increase in both movement time and error rates occurred one to two target sizes later than for the free-space device used by Bérard et al. Thus the Leap Motion was more precise and could select targets of a smaller size.

4 Conclusion

We evaluated transition times within a three-device hybrid setup, which included a keyboard, a mouse and the Leap Motion. As expected, the Leap Motion was slower to complete 2D tasks than the mouse. Yet, we found that transition times were only slightly affected by the input device, which is a positive result as the Leap Motion could be used together with the mouse without introducing an overly large transition cost between devices. This implies that it is feasible to design hybrid interaction setups, where coarse-scale manipulation tasks are done with the Leap Motion and the mouse is then used for precision work.