
1 Introduction

Immersive virtual reality (IVR) has existed for decades, but 2016 is likely to become the year in which it truly becomes accessible to consumers for the first time. This is an exciting prospect, but also one that involves a series of challenges. One activity which is likely to pose a challenge is virtual travel. Virtual travel, or locomotion, is regarded as one of the most common and universal activities occurring during interaction with three-dimensional (3D) computer-generated environments [3]. Throughout the following we use the terms travel and locomotion interchangeably. Generally, travel can be understood as the low-level actions performed in order to get from one point to another within a virtual (or real) environment; e.g., controlling the orientation, position and velocity of the virtual viewpoint [3].

In this paper we focus on a specific approach to facilitating virtual travel which appears to be ideally suited for use in relation to consumer IVR; namely Walking-in-Place (WIP) techniques. These techniques enable users to travel through virtual worlds by performing stepping-like movements on the spot that serve as a proxy for real steps. Particularly, we present a taxonomy of virtual travel techniques, review past work on WIP locomotion, and summarize our recent work which has sought to increase the perceived naturalness of WIP locomotion [14–20].

2 A Taxonomy of Virtual Travel Techniques

A plethora of different virtual travel techniques have been proposed—all uniquely suited for completing particular tasks and useful within specific contexts. Consequently, classification and categorization of interaction techniques has become a common theme within 3D interaction research, and several different, yet complementary, taxonomies classifying and categorizing interaction techniques for virtual travel have been proposed [3]. Our general taxonomy for describing virtual travel techniques [16] is inspired by existing categorisations [3, 23, 25, 30] and organizes virtual travel techniques into three orthogonal classifications: user mobility, virtual movement source, and metaphor plausibility.

Metaphor plausibility: First, virtual travel techniques may qualify as either mundane (virtual movement based on a metaphor adopted from real-world travel) or magical (virtual movement based on a metaphor that is not limited by real world constraints; e.g., the laws of physics, biological evolution, or the current state of technological development).

Virtual movement source: Second, one may distinguish between travel techniques that simulate body-centric travel (virtual movement is generated by directly exerting forces on the environment; e.g., simulation of walking, swimming, or flying) or vehicular travel (forces are indirectly produced through interaction with a virtual vehicle or interface; e.g., the throttle and steering wheel).

User mobility: Finally, it is possible to distinguish between travel techniques where the user is mobile (physical movement is necessary for virtual travel) or stationary (the user remains stationary while moving virtually).
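The three classifications can be read as three independent two-valued attributes, so that every technique falls into one of eight sub-categories. The following is a minimal sketch of that reading in Python; the class and field names are our own illustrative choices and are not taken from [16], and the two example placements simply follow the definitions given above.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class MetaphorPlausibility(Enum):
    MUNDANE = "mundane"
    MAGICAL = "magical"


class MovementSource(Enum):
    BODY_CENTRIC = "body-centric"
    VEHICULAR = "vehicular"


class UserMobility(Enum):
    MOBILE = "mobile"
    STATIONARY = "stationary"


@dataclass(frozen=True)
class TravelTechnique:
    """A travel technique placed along the three classifications (names are hypothetical)."""
    name: str
    plausibility: MetaphorPlausibility
    source: MovementSource
    mobility: UserMobility


# Every combination of the three two-valued classifications.
SUB_CATEGORIES = list(product(MetaphorPlausibility, MovementSource, UserMobility))

# Illustrative placements that follow from the definitions in the text.
REAL_WALKING = TravelTechnique(
    "real walking",
    MetaphorPlausibility.MUNDANE,
    MovementSource.BODY_CENTRIC,
    UserMobility.MOBILE,
)
WALKING_IN_PLACE = TravelTechnique(
    "walking in place",
    MetaphorPlausibility.MUNDANE,
    MovementSource.BODY_CENTRIC,
    UserMobility.STATIONARY,
)
```

In this sketch real walking and walking in place differ only in user mobility, which is the distinction that matters most for the spatial constraints of consumer IVR.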

Most of the travel techniques belonging to each of the eight sub-categories of the taxonomy (Fig. 1) have their merits in that they provide the users with a means to navigate virtual environments. Nevertheless, not all are equally viable in relation to consumer IVR, and the nature of the individual techniques makes them useful only to a limited set of applications [16]. In relation to metaphor plausibility there are important differences between techniques that qualify as either magical or mundane. For one, Bowman et al. [4] have suggested that magical techniques in many cases can be designed so as to offer superior task performance compared to mundane techniques. To exemplify, if the user is required to traverse great distances in the virtual environment, then teleportation is likely to be much more efficient than virtual walking. However, the superior task performance can come at the expense of familiarity [4] and the given application may itself call for a technique based on a mundane metaphor; i.e., any scenario taking place in a world adhering to the same rules as physical reality. IVR has been used to simulate a range of different types of mundane forms of vehicular transport, and vehicle simulators have arguably been used to provide some of the most compelling IVR experiences [5]. However, we frequently navigate our surroundings on foot and walking is generally regarded as a natural and promising approach to virtual travel [24]. Thus, it seems likely that many applications for consumer IVR will also involve body-centric modes of locomotion, such as walking and running. Turning to the question of user mobility, allowing users to physically walk through virtual environments provides a number of advantages; e.g., the physical translation produces vestibular self-motion information which furthers the walker’s spatial understanding [3]. Indeed, real walking has been highlighted as the most obvious and direct technique for virtual travel [3].
However, real walking poses a considerable problem since the virtual environment is likely to be larger than the physical interaction space. A number of mobile travel techniques have sought to minimize this issue. Most notably, redirection techniques make it possible to discretely or continuously reorient or reposition the user through overt or subtle manipulation of the stimuli used to represent the virtual world (for an overview of redirection techniques see [25]). While such solutions seem very promising, they do require the user to physically move and therefore do not seem feasible for consumer IVR where the spatial constraints are prominent. Several stationary approaches to virtual walking have been proposed, including but not limited to omnidirectional treadmills [8], human-sized hamster balls [13] and friction-free platforms [26]. In principle such systems could be deployed in the homes of consumers, but most current solutions require a considerable amount of space and even the cheaper alternatives [7] come at a relatively high price. Walking-in-Place techniques constitute a practical and inexpensive alternative which can be implemented using off-the-shelf hardware. The advantages of WIP locomotion include cost-effectiveness and convenience [9], good performance on simple spatial orienting tasks [33], proprioceptive feedback similar, though not identical, to that of real walking [22], and the ability to elicit a stronger sense of presence than more traditional peripherals [29]. Combined, these potential benefits suggest the need for finding the best possible WIP technique.

Fig. 1. Illustration of our taxonomy [16] which organizes virtual travel techniques based on virtual movement source (vertical axis), metaphor plausibility (horizontal axis), and user movement (division of each cell).

3 Walking-in-Place Techniques

It is possible to break down the process of producing virtual walking from steps in place into three steps: (1) proxy step detection, (2) speed estimation, and (3) steering [32]. The following review focuses on different approaches to these three steps (for a more comprehensive review of proposed WIP techniques and the evaluations of these please refer to [16]).

3.1 Proxy Step Detection

Proxy step detection can be performed using a variety of different hardware based on tracking of different body parts and varying properties of the performed movement. Generally it is possible to distinguish between systems that detect discrete gait events (e.g., foot-ground contact) and systems that track continuous movement (e.g., foot position and velocity) [14].

Physical walking platforms fall into the former category. Bouguila et al. [2] describe such an interface which is able to detect the user’s stepping speed based on four load sensors embedded in the platform. Moreover, this platform is able to reorient the user towards the visual display, and it is able to simulate surface inclines and declines via three air cylinders mounted underneath the platform. Similarly, the Walking Pad is a physical platform that is able to detect the user’s step frequency based on 60 iron switch sensors embedded on a 45 cm × 45 cm plexiglass surface [1]. This high number of sensors also makes it possible to detect the user’s orientation when both feet are grounded. Notably, this type of interaction has also been accomplished using commercially available hardware. Specifically, Wii Balance Boards, which are embedded with four pressure sensors, have been used to detect users’ steps during virtual locomotion [33]. The technique dubbed Shadow Walking [35] takes a very different approach to proxy step detection; i.e., a camera is used to track the shadows cast by the users’ feet onto the floor of an under-floor projection system within a six-sided CAVE, and based on this information the stepping speed is derived.

Interestingly, one of the earliest WIP techniques, the Virtual Treadmill [22], did not involve tracking of the lower limbs. Instead, this technique relied on electromagnetic tracking of the user’s head movements and used a neural network to determine if the user was walking in place. The technique Low-Latency, Continuous-Motion Walking-in-Place (LLCM-WIP) relied on magnetic tracking to determine the vertical velocity of the user’s heels [9]. The successor of the LLCM-WIP, called Gait-Understanding-Driven Walking-In-Place (GUD-WIP), similarly derived walking speeds from the velocity of the user’s vertical heel movement, but did so using an optical motion capture system [31]. One of the most recent WIP techniques, Speed-Amplitude-Supported Walking-in-Place (SAS-WIP), also used optical motion tracking but relied on the footstep amplitude rather than heel-motion velocity [6]. Continuous tracking of the user’s movement can also be achieved using commercially available hardware. Particularly, the technique Sensor-Fusion Walking-in-Place (SF-WIP) is based on the acceleration and magnetic sensors embedded within two smart phones in combination with a magnet [10], and the skeletal data provided by Microsoft’s Kinect has also been used to facilitate WIP locomotion [12].

Finally, a combination of discrete and continuous tracking has also been used to facilitate WIP locomotion; e.g., the Gaiter WIP technique allowed users to control virtual movement through a combination of force sensors embedded in shoe insoles and magnetic [27] or optical motion tracking [28].
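To illustrate the distinction between discrete gait events and continuous movement, the following minimal sketch derives discrete foot-ground contact events from a continuously tracked heel height. It is not the detection scheme of any of the systems reviewed above; the function name, the two thresholds and the use of hysteresis (a higher threshold for lifting than for grounding, so that tracker noise near the floor does not register as a step) are assumptions made purely for illustration.

```python
def detect_ground_contacts(heel_heights, lift_threshold=0.05, ground_threshold=0.02):
    """Return the sample indices at which a lifted heel returns to the ground.

    heel_heights: heel height above the floor in metres, one value per sample.
    Both thresholds are hypothetical values chosen for illustration.
    """
    contacts = []
    lifted = False
    for index, height in enumerate(heel_heights):
        if not lifted and height > lift_threshold:
            # The heel has clearly left the floor: a proxy step has begun.
            lifted = True
        elif lifted and height < ground_threshold:
            # The heel is back on the floor: report a discrete contact event.
            lifted = False
            contacts.append(index)
    return contacts
```

A system relying on continuous tracking would instead pass the full height (or velocity) signal on to the speed estimation step, rather than reducing it to a list of events.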

3.2 Speed Estimation

Much of the literature on WIP techniques does not provide detailed accounts of how virtual speeds are produced from the user’s input [9]. Perhaps as a consequence, there is no generally accepted way of doing so. Based on personal correspondence with one of the creators of the Virtual Treadmill, Feasel et al. [9] describe that this early WIP technique produced discrete viewpoint displacement; i.e., when the neural network registered a step, the viewpoint abruptly jumped a full step length forward. Moreover, this technique suffered from noticeable starting and stopping latency because movement was not instigated until four steps in place were detected, and movement would not be terminated unless no steps were detected for two full gait cycles. The LLCM-WIP, developed by Feasel et al. [9], was designed to provide low starting and stopping latency, continuous motion between steps, and control of the speed during steps, and to minimize erroneous movement during turns on the spot. As mentioned, LLCM-WIP takes the velocity of the user’s heel movement as input (derived from the positional tracking through numerical differentiation). In very general terms the algorithm produces the virtual speed by smoothing the heel velocities of each foot (low-pass filtering), summing the resulting signals, and scaling this sum so that the output speed on average matches the users’ real walking speeds. The GUD-WIP algorithm reportedly outperforms the LLCM-WIP and differs from its predecessors in that it is informed by human biomechanics and produces walking speeds that better correspond with those of real walking. Moreover, it determines the virtual speed based on a biomechanics-inspired state machine that can estimate the step frequency several times per step. Since real walking speeds can be estimated from the height of an individual and a given step frequency, this permits the algorithm to produce realistic walking speeds (for implementation details see [31]).
As mentioned, SAS-WIP relies on footstep amplitudes, rather than step frequencies, for producing virtual speeds. This approach was chosen since steps in place, unlike real walking, predominantly involve vertical motions and each step may also take less time to complete. Specifically, the virtual velocity is calculated by multiplying the foot speed by a scale factor based on the foot amplitude, and movement is stopped when both feet have been grounded for longer than a time limit that varies with the foot speed [6]. Finally, Langbehn et al. [12] have proposed Leaning-Amplified-Speed Walking-in-Place (LAS-WIP). The interest of this technique lies not in the way virtual speeds are derived from steps in place, but in a novel way of scaling the speed derived from these steps; i.e., the user is able to increase the speed by leaning the torso forward.
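The general pipeline attributed to LLCM-WIP in this subsection (numerical differentiation, low-pass filtering per foot, summation, and scaling) can be sketched as follows. This is only a rough illustration of those four stages and not the algorithm of Feasel et al. [9]: the first-order filter, the cutoff frequency, the use of the magnitude of the heel velocity, and all function names are assumptions on our part.

```python
import math


def heel_speeds(heights, dt):
    """Numerically differentiate heel heights (m) into vertical speeds (m/s).

    Taking the magnitude of the velocity is an assumption of this sketch.
    """
    return [abs(b - a) / dt for a, b in zip(heights, heights[1:])]


def low_pass(signal, cutoff_hz, dt):
    """Smooth a signal with a first-order exponential low-pass filter."""
    alpha = dt / (dt + 1.0 / (2.0 * math.pi * cutoff_hz))
    smoothed, state = [], 0.0
    for sample in signal:
        state += alpha * (sample - state)
        smoothed.append(state)
    return smoothed


def virtual_speed(left_heights, right_heights, dt, cutoff_hz=1.0, scale=1.0):
    """Smooth each foot's heel speed, sum the two signals, and scale the sum."""
    left = low_pass(heel_speeds(left_heights, dt), cutoff_hz, dt)
    right = low_pass(heel_speeds(right_heights, dt), cutoff_hz, dt)
    return [scale * (l + r) for l, r in zip(left, right)]


def calibrate_scale(unscaled_speeds, real_walking_speed):
    """Choose the scale so the output matches a real walking speed on average."""
    mean = sum(unscaled_speeds) / len(unscaled_speeds)
    if mean == 0.0:
        raise ValueError("cannot calibrate from a recording without movement")
    return real_walking_speed / mean
```

In use, one would record a short calibration walk, call `virtual_speed` with `scale=1.0`, pass the result to `calibrate_scale` together with the user's real walking speed, and reuse the returned factor afterwards. The choice of cutoff frequency trades smoothness between steps against starting and stopping latency, which is the tension the design goals listed above describe.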

3.3 Steering

In relation to virtual travel, steering amounts to the continuous manipulation of the direction of heading [3]. The direction of heading can either be derived from the data used for proxy step detection, as was the case in relation to the physical platforms described in subsection 3.1, or it can be obtained from additional trackers mounted on the user. At least five different approaches to steering during WIP locomotion have been used: joystick-controlled steering, gaze-directed steering, torso-directed steering, hip-directed steering, and feet-directed steering [9, 14, 33, 34]. A potential limitation of using joysticks or similar peripherals for steering is that this will deprive the user of the proprioceptive and kinesthetic feedback produced by whole body turns [28]. An advantage of gaze-directed steering, which translates the virtual position in the viewing direction, is that one does not need sensors besides the ones used for head tracking. A limitation of this approach is that it limits the user’s ability to look around the environment while walking [33]. Nevertheless, it has been documented that gaze-directed steering may be experienced as preferable and perform better than torso-directed steering in regards to certain spatial orienting tasks [34]. Notably, in relation to torso-directed steering, trackers are often placed on the chest [9]. However, it has been suggested that placement of the tracker near the waist may be preferable [3] (i.e., something akin to hip-directed steering).
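The tracker-based steering approaches differ only in which body part supplies the direction of heading. The following minimal sketch makes this explicit by translating the viewpoint along the yaw of whichever tracker is selected. The enum, the function names, the two-dimensional ground-plane representation and the angle convention (0 degrees pointing along +z, 90 degrees along +x) are all hypothetical choices made for illustration.

```python
import math
from enum import Enum


class SteeringSource(Enum):
    GAZE = "head"
    TORSO = "chest"
    HIP = "waist"
    FEET = "feet"


def heading_vector(yaw_deg):
    """Unit vector on the ground plane for a yaw angle (0 deg = +z, 90 deg = +x)."""
    yaw = math.radians(yaw_deg)
    return (math.sin(yaw), math.cos(yaw))


def advance(position, tracker_yaw_deg, source, speed, dt):
    """Move the viewpoint along the heading of the selected tracker.

    position: (x, z) on the ground plane in metres.
    tracker_yaw_deg: mapping from SteeringSource to that tracker's yaw in degrees.
    speed: virtual speed in m/s, e.g. the output of the speed estimation step.
    """
    dx, dz = heading_vector(tracker_yaw_deg[source])
    x, z = position
    return (x + dx * speed * dt, z + dz * speed * dt)
```

If the user looks 90 degrees to the side while the torso keeps facing forward, gaze-directed steering moves the viewpoint sideways whereas torso-directed steering continues straight ahead, which is the limitation of gaze-directed steering noted above.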

4 A Question of Naturalness

The novelty of proposed WIP techniques often derives from the particular hardware or algorithms used to enable virtual movement, and the evaluations usually involve comparisons with existing WIP techniques or other approaches to virtual locomotion (for a more detailed account of common measures please refer to [16]). Improvements to hardware and algorithms are undoubtedly crucial. However, considering that the general aim of WIP techniques is to provide an alternative to real walking, it seems meaningful for research on WIP locomotion to focus more explicitly on how we increase the perceived naturalness of the walking experience; i.e., how we can make the experience of navigating through virtual worlds using WIP techniques feel more like the real thing. Specifically, we have argued that when striving to increase the perceived naturalness of WIP locomotion, it is meaningful to take as the point of departure the degree of correspondence between the sensorimotor loop of real walking and walking in place [15]. This view has led us to focus on two distinct, albeit interconnected, questions: (1) How can we increase the perceived naturalness of the actions performed by the user during WIP locomotion? (2) How can we increase the perceived naturalness of the perception of the virtual environment resulting from said actions? In what follows we address each of these two general research questions and present the work we have performed thus far in order to address them.

5 Perceptually Natural Actions

The question of how to facilitate natural actions may be subdivided into at least two different, albeit interconnected, challenges; namely, the challenge of finding the gestural input which is perceived as the most natural by the user, and the challenge of how to provide the user with the most natural method for steering.

5.1 Gestural Input for WIP Locomotion

While a few exceptions exist [27], it would seem that most WIP techniques generally take the same gesture as input; i.e., a stepping gesture where the user alternately lifts each leg as if marching on the spot. However, we believed that this gesture might be less than optimal for two reasons: (1) It appeared to be more physically straining than real walking, which may decrease perceived naturalness. (2) When used in combination with a head-mounted display (HMD), the user tends to physically drift in the direction of heading [32]. This motivated us to perform two within-subjects studies exploring alternative gestural input for WIP locomotion. The first study (n=27) [14] compared three gestures: the common WIP gesture, Wiping (alternately bending each knee as if wiping the feet on a doormat), and Tapping (alternately tapping each heel against the ground). The second study (n=20) [15] was focused on gestures devoid of explicit leg motion and compared four gestures: the common WIP gesture, Hip Movement (alternately swinging the hips to the left and right), Arm Swinging (alternately swinging each arm back and forth), and keyboard input (while standing the user pressed a button to move). In both studies the participants performed a simple walking task requiring them to walk along a predefined path within a scenic virtual environment. The visuals were presented using an HMD and the users’ movements were tracked using an optical motion capture system. The different types of gestural input were, among other things, compared based on self-reported measures of how natural they were (to what extent did they feel like real walking) and how physically straining they were compared to real walking. The amount of physical drift was logged during all walks. The results of the first study revealed that Tapping was perceived to be as natural as the traditional gesture and corresponded best with real walking in terms of perceived exertion.
Also, Tapping led to significantly less drift than Wiping and the traditional gesture. The second study revealed that Arm Swinging and the traditional gesture were perceived to be the most natural, and Arm Swinging provided the best match with real walking in terms of physical strain. The fact that Arm Swinging prevents walkers from interacting with their hands while walking, combined with the ratings of naturalness across the two studies, led us to believe that Tapping probably would be preferable for most applications. Even though Tapping, or a variation of this gesture, seems promising, it does not solve the problem of how to enable backwards and lateral movement. While a few exceptions exist [27, 35], most work on WIP locomotion has focused on forwards movement. Thus, it is necessary for future research to explore the gestural input and algorithms that can produce perceptually natural movement in other directions.

5.2 Perceptually Natural Steering

The question of what body part to rely on when deriving the user’s orientation remains open. While gaze-directed steering may be superior on certain spatial orienting tasks, this steering method will presumably be perceived as less natural since it differs notably from how steering is performed during real walking. At first glance, the difference between torso-directed and hip-directed steering seems negligible. However, we have informally observed that torso-directed steering using a tracker on the chest may be less natural compared to feet-based or hip-based steering since users may slightly turn their upper bodies while looking around the environment and thereby veer off course. Future studies should compare these steering methods in order to determine which ones are the most natural and how they affect performance and spatial perception.

6 Perceptually Natural Self-Motion and Limb-Movement

The question of how to facilitate natural perception may also be subdivided into at least two challenges; i.e., the challenge of facilitating natural self-motion perception and the challenge of facilitating natural movement of virtual limbs.

6.1 Motion Perception During WIP Locomotion

Existing WIP techniques have aspired to produce realistic walking speeds [9, 31], and intuitively one might expect realistic speeds to be preferable. However, studies have shown that individuals tend to underestimate visually presented speeds when walking on a linear treadmill; i.e., visual speeds matching the treadmill speed feel too slow (for examples see references in [17]). If the same is true of WIP locomotion, then it is necessary to establish what speeds are perceived as natural during this form of locomotion. We performed seven studies and two meta-analyses in order to determine if speeds are indeed misperceived during WIP locomotion, and to explore which factors influence this misperception [17–20]. Common to all seven studies was that the participants would walk in place and walk on a treadmill down a virtual corridor at a fixed step frequency (1.8 steps per second) while an HMD displayed a range of visual gains; i.e., scalar multiples of their normal walking speed (1.0 would correspond to their normal walking speed). They were then asked to determine at what gains the speed was natural; i.e., it matched the movement they were performing. Across the studies three different gain presentation methods (GPMs) were used, implying that there were differences in terms of how the gains were presented and how the participants provided their judgements. (1) Randomized Order (RO): Each gain was repeated twice and they were presented in randomized order, and the participants judged if each gain was ‘too slow’, ‘natural’, or ‘too fast’. (2) Reversed Staircases (RS): Each gain was repeated twice, but either in an ascending or descending series, and judgements were made as in relation to RO. (3) User Adjustment (UA): The speed would either start at the lowest or the highest gain, the participants controlled the visual speed using a scroll wheel and had to identify the upper and lower limits of the speeds they found natural.
The three methods were adapted from existing psychophysical methods (the method of constant stimuli, the method of limits and the method of adjustment). All seven studies (S1-S7) relied on within-subjects designs and compared WIP and treadmill locomotion. S1 involved two additional movement types (Tapping and no leg movement), S2 compared four different display fields of view (FOV), S3 compared three different geometric FOVs, S4 compared three different degrees of peripheral occlusion, S5 compared two different HMD weights, S6 compared three different step frequencies, and S7 compared the three different gain presentation methods outlined above. Table 1 presents the number of participants, gain presentation method, range of gains and conditions used in S1 to S7.
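The notion of a visual gain and the Randomized Order presentation method can be illustrated with a short sketch. It shows how a gain maps a participant's normal walking speed to a displayed speed, how an RO trial list with two repetitions per gain might be generated, and one simple way of summarizing which gains were judged natural. The function names, the example gain values and the summary by minimum and maximum are assumptions made for illustration and do not reproduce the analysis used in [17–20].

```python
import random


def visual_speed(gain, normal_walking_speed):
    """Displayed speed for a gain; a gain of 1.0 equals the normal walking speed."""
    return gain * normal_walking_speed


def randomized_order(gains, repetitions=2, seed=None):
    """Build an RO trial list: every gain repeated, then shuffled."""
    trials = [gain for gain in gains for _ in range(repetitions)]
    random.Random(seed).shuffle(trials)
    return trials


def natural_range(judgements):
    """Lowest and highest gain judged 'natural', or None if there were none.

    judgements: iterable of (gain, label) pairs, where label is one of
    'too slow', 'natural' or 'too fast'.
    """
    natural = [gain for gain, label in judgements if label == "natural"]
    return (min(natural), max(natural)) if natural else None
```

For example, with hypothetical gains of 1.0, 1.5 and 2.0, `randomized_order` yields six trials, and a participant who labels 1.0 as 'too slow' and the two larger gains as 'natural' would be summarized by the range (1.5, 2.0).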

Table 1. No. of participants, GPMs, range of gains and conditions of S1 to S7.

S1 did not reveal significant differences between the traditional WIP gesture, Tapping, treadmill walking and no leg movement. However, we were able to demonstrate that underestimation of visually presented walking speeds may indeed occur during WIP locomotion, and there appears to exist a range of gains that are perceived as natural. S2 revealed significant differences between the different display FOVs across both WIP and treadmill locomotion, suggesting that the size of the FOV may be inversely proportional to the degree of underestimation. In other words, the misperception appears to decrease as the FOV of the display becomes larger. In relation to S3, a similar effect was observed with respect to the geometric field of view. S4 and S5 found no significant effects in relation to varying degrees of peripheral occlusion and increased HMD weight. S6 provided some indication that high step frequencies may be accompanied by an increased underestimation of the visually presented speeds. Finally, S7 revealed that the choice of gain presentation method may affect the upper and lower bounds of the gains which participants find natural. While S1 did not suggest that the underestimation of speeds varies across WIP and treadmill locomotion, we were able to provide evidence that there may be a difference through meta-analyses of the data from all seven studies [16]. Particularly, the meta-analyses suggested that individuals tend to find slightly higher speeds natural when walking on a treadmill compared to when they are walking in place.

6.2 Self-Perception During WIP Locomotion

The sensation of virtual body-ownership may be crucial to compelling IVR experiences [21]. However, it may prove difficult to sustain this illusion during WIP locomotion if the virtual legs exhibit normal gait behaviour in response to the user’s steps in place. This would produce visuomotor asynchrony which is believed to break the illusion of ownership of the virtual body [11]. Thus, it will be necessary for future work to investigate if there are ways to produce a sense of virtual body-ownership during WIP locomotion.

7 Conclusions

Throughout this paper we presented arguments suggesting that WIP locomotion may prove to be a meaningful way of facilitating virtual walking in relation to consumer IVR. However, there are still challenges that need to be met. While WIP techniques have improved greatly since Slater et al. [22] proposed the Virtual Treadmill, it remains important to try and improve techniques with respect to the virtual-locomotion speed-control goals introduced by Feasel et al. [9]: smooth between-step locomotion speed, continuous within-step speed control, real-world turning and manoeuvring, and low starting and stopping latency. With respect to perceptually natural actions, future work should try to determine what gestures provide the most natural experience of walking forward, backward and laterally, and what steering methods will be perceived as the most natural. With respect to natural perception, we still do not know exactly what causes underestimations of visually presented walking speeds, or if this perceptual distortion will be eliminated once we get HMDs of even higher fidelity. As a consequence it may be necessary to establish HMD-specific guidelines describing what gains to apply in order to produce perceptually natural motion perception. Moreover, it has yet to be documented whether virtual body-ownership can be sustained during WIP locomotion. Obviously, the limitations of the available tracking and display systems will constrain the degree of naturalness developers can opt for. To exemplify, systems such as the HTC Vive currently do not support full body tracking, which makes it impossible to generate self-motion and virtual leg movement from tracking of the lower extremities and precludes torso or hip-directed steering. Fortunately, tracking solutions such as Microsoft’s Kinect or Sixense’s STEM System could resolve this issue.

The experience of WIP locomotion will probably never become truly mistakable for real walking. Nevertheless, if the challenges outlined in the current paper are addressed, it seems possible that this type of virtual travel may serve as a meaningful substitute in relation to consumer IVR. Our hope is that the findings outlined in the current paper will help bring WIP locomotion one step closer to this goal.