Abstract
Voice recognition systems provide a method of hands-free control of robotic systems that may be helpful in law enforcement or military domains. However, the constraints of the operational environment limit the capabilities of the on-board voice recognition system to a keyword-based command system. To use the system effectively, users must learn the available commands and practice pronunciation to ensure accurate recognition. Virtual reality simulation gives users the opportunity to train with the voice recognition system and the robot platform in realistic interactive scenarios. Training in virtual reality with a head-mounted display may increase immersion and sense of presence compared to using a keyboard and monitor. A small pilot study compared user experience in the desktop mode and the virtual reality mode of our voice recognition training tool. Participants controlled a simulated unmanned ground vehicle in both modes across four different environments. The results revealed no significant differences in simulator sickness, sense of presence, or perceived usability. However, when asked to choose between the desktop mode and the head-mounted display mode, users indicated an overall preference for the head-mounted display. At the same time, users perceived the head-mounted display to be more complex, less consistent, and more difficult to learn to use. The desktop mode was perceived as easier to use, and users reported being more confident when using it.
Supported by the Center for Advanced Vehicular Systems, Mississippi State University and by the Slovak Research and Development Agency - APVV-15-0731, APVV-SK-TW-2017-0005, Ministry of Education, Science, Research and Sport of the Slovak Republic under the research project VEGA 1/0511/17 and by Cultural and Educational Grant Agency of the Slovak Republic, grant No. 009TUKE-4/2019.
1 Introduction
Voice recognition systems provide a powerful potential method of control for robotic systems. In law enforcement, communication between team members is verbal and gestural. By providing a verbal interface for a small unmanned ground vehicle (sUGV) for special weapons and tactics (SWAT) operations, team members can operate the sUGV hands-free and maintain situation awareness [1]. However, the constraints of the operational environment limit the network connectivity and on-board computational power available to the voice recognition system and thereby limit its capabilities to a keyword-based command system. In the keyword-based command system, the officers must learn the available commands and how to pronounce them to ensure proper recognition. Then, the officers must accurately recall the commands and say them correctly in highly stressful and dynamic situations. In an early test of our voice recognition system, officers failed to recall the commands [1]. To address recall failures and to assist with recognizable pronunciation of commands, we developed training tools that allowed officers to practice issuing verbal commands to the voice recognition system [2, 3]. Our most recent virtual environment training tool includes an operating environment, a simulated sUGV, and supports both virtual reality and desktop computer-based training [3].
Virtual reality provides a more immersive training environment that increases engagement and retention [4, 5]. However, VR requires users to wear a head-mounted display, isolate themselves from their surroundings, and set aside time dedicated to VR training. The desktop training system has lower computational requirements, requires less start-up time, and better supports drop-in/drop-out training. It is not known whether the benefits of VR training outweigh the limitations.
2 Related Work
Virtual reality using head-mounted displays is an emerging technology with many potential applications [5,6,7,8,9,10]. The most recent generation of VR HMDs has made significant progress in addressing the technological issues that previously limited adoption. Improved tracking, reduced latency, high-resolution displays, and advanced graphics capabilities have converged to provide powerful, immersive simulations. The technology is effective not only for gaming and data visualization: it has been rapidly adopted for education and training by the military, industry, and sports [4, 8]. VR training can increase retention of knowledge and improve task performance.
Despite its advantages, there are drawbacks to VR HMDs that may limit use of the technology. HMDs can cause user discomfort in many forms: eyestrain, heat, neck and head pain, fatigue, and simulator sickness [11,12,13,14,15]. Many of the advances in HMD technology have helped to address factors known to contribute to simulator sickness: low frame rates, low-quality displays, high latency, and poor tracking. However, movement through virtual spaces can still lead to simulator sickness, and no available solution fully addresses the issue. Often, the user has limited space to move, either because of the limitations of the physical space (room size, obstacles) or because of the limitations of the VR system. Many of the methods that allow the user to move through a virtual space larger than the physical space contribute to simulator sickness [15, 16]. Steering locomotion, whether with a joystick in VR or with a keyboard and mouse in games, often leads to simulator sickness in VR [14]. Some methods, like teleportation or portaling [17,18,19], modify how the user moves through space. Some methods apply visual effects to reduce simulator sickness [20, 21]. Others use physical motions to drive virtual motions and provide physical cues to the user's sensory system [22, 23]. Another popular technique, redirected walking, takes advantage of control over all visual inputs to the user and manipulates the user into walking in circles while believing they are walking straight [24]. Each of these methods has strengths and weaknesses that may depend on the context and the tasks to be performed in the virtual environment.
VR HMDs are also not always convenient. The HMD is not an integral part of the computer; it is an optional add-on purchased for special applications. Typically, a user works with a keyboard and monitor for most tasks, then starts a specific VR application, puts on the HMD, and interacts with the virtual environment. While wearing the HMD, the user is often blind to the outside environment and may have difficulty communicating with those in their physical space [25]. For any task outside of the specific application, the user may have to remove the HMD, perform the task with the keyboard and mouse, put the HMD back on, and return to the VR application. This switching cost may also reduce the perceived usability of a VR training tool or application.
The differences between VR and desktop modes for training and learning have been explored by many researchers, but the results have been inconsistent. In some cases, no difference is found in quantitative assessments, but participants self-report special benefits (e.g., improved spatial insight, greater realism) as well as increased difficulties in the VR mode [26]. Others show only slight improvements for VR in quantitative metrics [27]. In a navigation task, users reportedly preferred the VR mode, but performance measures favored the desktop mode [28]. These results suggest that the strengths of VR may be offset by its weaknesses. The advantages of VR may be context dependent and limited to specific aspects of the training task.
The current study compares VR and desktop modes for a training tool to evaluate potential differences in simulator sickness, sense of presence, usability, and user preferences for the two modes.
3 Apparatus
We developed the desktop and VR training tool using Unity 2017. The tool was designed to provide more realistic and immersive training with the voice recognition system. In the training tool, participants were directed to search virtual environments for boxes containing contraband (e.g., drugs) and find and disarm a small bomb. Participants interacted with the simulation in VR using an HMD and on a desktop system using a standard display.
3.1 Robot and Environment
We imported a virtual sUGV model based on Dr. Robot’s Jaguar V4 Mobile Robotic Platform [29]. A physical robot of the same design is used in our laboratory and in training activities with local law enforcement officers. Four virtual environments were used in the study. We acquired two complete virtual environments from the Unity Asset Store: a desert city environment [30] and a shooting range [31]. We developed two additional environments for the project: a school environment consisting of a single hallway lined with lockers and two classrooms and an office space with three rows of cubicle desks. See Fig. 1 for top-down renderings of the four virtual environments.
For this study, participants were told to search for boxes of contraband and a bomb (see Fig. 2). We placed two boxes of contraband and a single bomb in each of the environments. In each environment, the items were placed in two configurations: one for the VR mode and one for the desktop mode.
3.2 Command and Control
The basic functions of the robot (move forward, backward, turn left, turn right, activate lights, activate sirens, etc.) were implemented using both physical controls and voice commands. Participants used voice commands to activate systems on the robot. Table 1 lists the voice commands available to participants during the study. In the study, participants used a 'push-to-confirm' model. In this model, the recognizer was always running and attempting to interpret utterances made by the participant. Participants used a keyword, 'Apple', to indicate to the recognizer that a command was being issued to the robot. The word(s) following the keyword were interpreted as a command to the robot. If no command was recognized in an utterance, the utterance was ignored. If a command was recognized, it was displayed to the participant via the voice command user interface. The participant then had to confirm the command; only after confirmation was the action performed. The 'push-to-confirm' model reduced the chance of accidentally activating one of the robot's systems.
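The 'push-to-confirm' flow described above can be sketched in a few lines. This is an illustrative reconstruction, not the study's actual recognizer code; the command set and function names are assumptions drawn from the functions mentioned in the text, not the full contents of Table 1.

```python
# Minimal sketch of the 'push-to-confirm' command model described above.
# The command set below is an assumption; see Table 1 for the real list.

KEYWORD = "apple"
COMMANDS = {"lights on", "lights off", "siren on", "siren off",
            "scan", "photo", "disarm"}

def parse_utterance(utterance: str):
    """Return the recognized command, or None if the utterance is ignored."""
    words = utterance.lower().split()
    if not words or words[0] != KEYWORD:
        return None                      # no keyword: ignore the utterance
    candidate = " ".join(words[1:])
    return candidate if candidate in COMMANDS else None

class PushToConfirm:
    """Display a recognized command; execute it only after confirmation."""
    def __init__(self):
        self.pending = None

    def on_utterance(self, utterance: str):
        cmd = parse_utterance(utterance)
        if cmd is not None:
            self.pending = cmd           # shown in the voice command UI
        return self.pending

    def on_confirm(self):
        cmd, self.pending = self.pending, None
        return cmd                       # the robot acts only at this point
```

Gating execution behind an explicit confirmation step means a misrecognized utterance can at worst display a candidate command; it can never trigger an action on its own.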
Participants used physical controls to select menu items, to drive the robot, and to activate special commands. To accommodate differences in the VR and desktop systems, controls were varied slightly between the two modes.
In VR, participants wore the HMD (Oculus Rift virtual reality headset) and held two controllers (Oculus Touch controllers). Participants used the HMD's built-in microphone to issue voice commands. Three Oculus sensors provided full 360-degree tracking of the participants. Participants selected menu commands by pointing at the menu items and pressing the left controller's joystick. Once in the environment, participants directed the movement of the robot using the joystick on the controller held in their left hand. Locomotion in VR can lead to simulator sickness, but the study environments are large and include multiple rooms, so some form of movement was required. For this study, we chose a common VR movement method: teleportation. Participants pressed down on the right controller's joystick, pointed to where they wanted to move, and released the joystick. Upon release, the participant's camera was instantly repositioned above the target position. This method allowed participants to control their view and minimized simulator sickness. At times, the robot could become stuck in the environment; participants could reset the robot's position by pressing and holding the right controller's grip button.
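The point-and-teleport mechanic described above amounts to a ray-plane intersection followed by repositioning the camera rig. The following is a minimal geometric sketch under an assumed flat floor at y = 0; the function name and eye-height parameter are ours, not details of the study software.

```python
# Illustrative geometry for point-and-teleport locomotion: intersect the
# controller's pointing ray with a flat ground plane (y = 0) and place the
# camera above the hit point. The flat-floor assumption and names are ours.

def teleport_target(origin, direction, eye_height=1.7):
    """origin and direction are (x, y, z) tuples. Returns the new camera
    position, or None if the ray never reaches the ground."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dy >= 0:                    # pointing level or upward: no ground hit
        return None
    t = -oy / dy                   # ray parameter where y reaches 0
    return (ox + t * dx, eye_height, oz + t * dz)
```

A production system would typically use an arcing (parabolic) pointer and validate that the target is reachable, but the straight-ray version above captures the core computation.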
In the desktop mode, participants used a keyboard and touchpad for the physical controls. Participants wore a headset microphone (Logitech Wireless Gaming Headset G930) to issue voice commands and used the touchpad only at the start of a scenario to make selections from menus. They directed the movements of the robot using the 'W-A-S-D' keys, a common configuration for gaming. A significant difference from the VR mode was that the participant's point of view was always locked to the robot's position. Participants could select between two views: a first-person view, as if viewing the scene through the robot's camera, and a third-person view, as if viewing from a chase camera just behind and above the robot. Participants used the 'Z' key to switch between the views. On the desktop system, participants reset the robot using the 'R' key (see Table 2).
4 Method
4.1 Participants
Participants were recruited from the general population in and around Starkville, MS. Five participants completed the preliminary study (3 men, 2 women). The average age of participants was 27.4 (SD: 7.16). All of the participants reported familiarity with virtual reality and reported at least some experience playing video games (80% sometimes play and 20% often play). Two participants wore corrective lenses. With regard to the frequency of simulator or motion sickness, one participant reported that it occurred often, two sometimes, and two never.
4.2 Procedure
All procedures were reviewed and approved by the Mississippi State University Institutional Review Board. We observed participants as they completed training in both environments (desktop and VR) to evaluate user preferences and usability of VR training compared to desktop training for learning voice controls for a sUGV in a law enforcement domain.
Participants completed a short demographics survey and an initial simulator sickness questionnaire (SSQ) [15]. The initial SSQ score provided a baseline for comparison. Participants were randomly assigned to start with either the VR mode or the desktop mode and opened the training tool using a shortcut on the desktop. In VR mode, participants put on the HMD and picked up the controllers; in desktop mode, participants put on the headset microphone. In both modes, participants began with an unscored trial in the desert city environment to familiarize themselves with the display and the controls used in that mode. The remaining three environments were presented in random order. In each trial, participants searched the environment for two boxes of contraband and a single bomb. We instructed participants to perform the following tasks: (1) find the items, (2) use the robot's 'scan' function to verify that an object was contraband or a bomb, (3) take a photo using the robot's 'photo' function, and (4) in the case of a bomb, use the robot's 'disarm' command to disable it. We further instructed participants that the highest priority was to find and disarm the bomb. Participants were given up to eight minutes to search the environment. When participants disarmed the bomb, the trial ended, whether or not they had discovered the contraband boxes.
After each trial, participants removed the HMD or the headset microphone and completely closed the training tool application. Participants then completed an SSQ and the System Usability Scale (SUS) [32]. After completing all four trials in the VR mode or the desktop mode, participants completed a 30-question presence survey [33, 34] and then switched to the other mode. After completing all trials for both modes, participants were asked to indicate, for each of 10 usability items adapted from the SUS [32], their preferred mode: VR, desktop, or both.
5 Results
Survey data was collected on-site using a Qualtrics web-based survey. Overall, the results revealed no significant differences between the desktop and HMD modes for simulator sickness, sense of presence, or perceived usability. When participants were asked to choose between the desktop mode and the HMD mode, results indicated that, overall, participants preferred the head-mounted display. However, participants also reported that the head-mounted display was more complex, less consistent, and more difficult to learn to use. The desktop mode was perceived as easier to use and participants reported being more confident when using it.
5.1 Simulator Sickness
The SSQ consists of 16 items describing symptoms associated with simulator sickness (e.g., headache, eyestrain) [15]. Participants indicated their current level of each symptom, with possible responses of None (0), Slight (1), Moderate (2), and Severe (3). We calculated the total simulator sickness score according to [15] for each trial. The average and maximum total scores for the VR and desktop modes are listed in Table 3. There was no significant difference in simulator sickness symptoms between baseline, VR mode, and desktop mode, F(2,8) = .942, p = .48.
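For reference, the total severity score can be computed from the 16 item ratings using the subscale weights in Kennedy et al. [15]. The sketch below reflects our reading of that scoring procedure; the item-to-subscale assignments (some items count toward two subscales) should be checked against the original paper before reuse.

```python
# Sketch of SSQ scoring per our reading of Kennedy et al. [15]. Items are
# rated 0 (None) to 3 (Severe); some items belong to two subscales.

NAUSEA = {"general discomfort", "increased salivation", "sweating",
          "nausea", "difficulty concentrating", "stomach awareness",
          "burping"}
OCULOMOTOR = {"general discomfort", "fatigue", "headache", "eyestrain",
              "difficulty focusing", "difficulty concentrating",
              "blurred vision"}
DISORIENTATION = {"difficulty focusing", "nausea", "fullness of head",
                  "blurred vision", "dizziness (eyes open)",
                  "dizziness (eyes closed)", "vertigo"}

def ssq_scores(ratings):
    """ratings maps symptom name -> 0..3.
    Returns (nausea, oculomotor, disorientation, total) weighted scores."""
    n = sum(ratings.get(s, 0) for s in NAUSEA)
    o = sum(ratings.get(s, 0) for s in OCULOMOTOR)
    d = sum(ratings.get(s, 0) for s in DISORIENTATION)
    return n * 9.54, o * 7.58, d * 13.92, (n + o + d) * 3.74
```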
5.2 Presence
The presence survey consisted of 30 items that, taken together, assess the level of immersion in the virtual environment. Our survey was based on [33], with two questions related to haptic interaction removed. Participants completed the presence survey after finishing the VR mode trials and again after the desktop mode trials. Table 4 lists descriptive statistics for the presence survey. As with the SSQ, there was no significant difference between the VR and desktop modes.
5.3 Usability
Participants were asked about the usability of the VR mode and the desktop mode in two ways. First, participants completed the SUS [32] after each trial. Second, after all trials were completed, participants were asked to select their preferred mode for 10 items based on the SUS items. The SUS is a 10-item survey designed to evaluate the usability of a system. We scored the SUS for each trial and then combined the VR and desktop scores to compare the overall means. As with the SSQ results, there was no significant difference between the mean reported usability of the VR and desktop systems, t(4) = −1.793, p = .147. However, Table 5 shows a large difference in minimum reported usability. In our preliminary data set, a single outlier participant strongly disliked the VR system (M = 11.67 SUS) but appeared to find the desktop mode more usable (M = 31.67 SUS). This was the only participant with a large difference in SUS scores between the two modes, and the difference was observed only for the SUS; the participant's SSQ and presence results did not differ substantially between modes. For all other participants, the mean SUS scores for VR and desktop were roughly the same.
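For reference, a per-trial SUS score is computed from the ten 1-5 item ratings as described by Brooke [32]: odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is scaled by 2.5 to yield a 0-100 score. The sketch below shows this standard scoring; the function name is ours.

```python
# Standard SUS scoring per Brooke [32]: ten items rated 1-5, alternating
# positively and negatively worded, scaled to a 0-100 score.

def sus_score(ratings):
    """ratings: list of ten 1-5 responses, in item order (item 1 first)."""
    assert len(ratings) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r   # i is 0-based here
                for i, r in enumerate(ratings))
    return total * 2.5
```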
After completing all trials, participants were asked a series of questions based on the SUS items. For each of the 10 items, participants chose between the VR mode, the desktop mode, or both. Table 6 lists the item text and the proportion of responses for each item.
6 Discussion
This preliminary study revealed clear differences in user perception of the VR and desktop modes of the training tool. Neither training mode (VR or desktop) produced signs of simulator sickness, despite requiring participants to explore four separate scenarios, each lasting up to 8 min. Because neither virtual environment imparted significant symptoms, simulator sickness was likely not a factor in participants' perceived usability of the system.
When comparing the VR and desktop versions, the majority of participants preferred the VR mode or both modes; only one participant preferred the desktop version alone. The increased complexity of the VR mode reported by participants was likely due to the added complexity of the navigation system used to move through the VR environment. VR was also perceived as inconsistent and poorly integrated into the system; again, the added complexity of the movement system in VR likely contributed to this perception. In addition, the mapping of actions to controller buttons could be improved. There was some inconsistency in the use of the joystick button for menu selection (push button + pull trigger) and for movement (push button + release button), which may also have contributed to the perception of complexity in the VR mode.
The increased complexity of the VR training tool likely contributed to participants' increased expectation that additional support and learning would be required to use the VR system for training. Together, these factors likely explain why, compared to VR training, the desktop mode was perceived as easier to use and imparted a higher sense of overall user confidence.
Overall, participants were able to interact with the voice recognition system in both modes, and the training tool appears to have potential regardless of which mode users prefer.
7 Conclusions and Future Work
This small pilot study compared participant experience in two modes: VR and desktop. We expected that the VR mode would provide additional immersion and sense of presence but would also be more difficult to use and could cause participants to suffer symptoms of simulator sickness. Participants' responses indicated that the two modes provided similar sense of presence and usability. When asked to select between the systems, participants indicated a preference for the VR mode but also identified challenges that may limit its use. Overall, the training tool scored well on usability. Future work should expand the sample size: the single participant who reported a poor experience could be a true outlier or could represent a minority of users who would strongly prefer the desktop mode. Future research should also evaluate participant performance with the voice recognition system, progress throughout training, and long-term retention and transfer from the training tool to the real world.
References
Pleva, M., Juhar, J., Cizmar, A., Hudson, C., Carruth, D., Bethel, C.: Implementing English speech interface to Jaguar robot for SWAT training. In: IEEE 15th International Symposium on Applied Machine Intelligence and Informatics (SAMI), Herl’any, Slovakia, pp. 105–110. IEEE (2017)
Hudson, C.R., Carruth, D.W., Bethel, C., Pleva, M., Juhar, J., Ondas, S.: A training tool for speech driven human-robot interaction applications. In: 15th International Conference on Emerging eLearning Technologies and Applications (ICETA), Stary Smokovec, Slovakia, pp. 1–6. IEEE (2017)
Hudson, C.R., Bethel, C.L., Carruth, D.W., Pleva, M., Ondas, S., Juhar, J.: Implementation of a speech enabled virtual reality training tool for human-robot interaction. In: 2018 World Symposium on Digital Intelligence for Systems and Machines (DISA), Kosice, Slovakia, pp. 309–314, IEEE (2018)
Bailenson, J.: Experience on Demand: What Virtual Reality is, How it Works, and What it Can Do. W. W. Norton & Company, New York (2018)
Stratos, A., Loukas, R., Dimitris, M., Konstantinos, G., Dimitris, M., Chryssolouris, G.: A virtual reality application to attract young talents to manufacturing. Proc. CIRP 57, 134–139 (2016)
Bout, M., Brenden, A. P., Klingegård, M., Habibovic, A., Böckle, M.-P.: A head-mounted display to support teleoperations of shared automated vehicles. In: 9th International Conference on Automotive User Interfaces and Interactive Vehicular Applications Adjunct, pp. 62–66, ACM, New York (2017)
Feeman, S.M., Wright, L.B., Salmon, J.L.: Exploration and evaluation of CAD modeling in virtual reality. Comput.-Aided Des. Appl. 15(6), 892–904 (2018)
Feloni, R.: Walmart is using virtual reality to train its employees. http://www.businessinsider.com/walmart-usingvirtual-reality-employee-training-2017-6. Accessed 6 Apr 2018
Jensen, L., Konradsen, F.: A review of the use of virtual reality head-mounted displays in education and training. Educ. Inf. Technol. 23(4), 1515–1529 (2018)
Clifford, R.M.S., Khan, H., Hoermann, S., Billinghurst, M., Lindeman, R.W.: The effect of immersive displays on situation awareness in virtual environments for aerial firefighting air attack supervisor training. In: 2018 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 1–2, IEEE (2018)
Souchet, A. D., Philippe, S., Zobel, D., Ober, F., Lévêque, A., Leroy, L.: Eyestrain impacts on learning job interview with a serious game in virtual reality: a randomized double-blinded study. In: 24th ACM Symposium on Virtual Reality Software and Technology, pp. 15:1–15:12, ACM, New York (2018)
Wang, Z., Chen, K., He, R.: Study on thermal comfort of virtual reality headsets. In: Ahram, T. (ed.) AHFE 2018. Advances in Intelligent Systems and Computing, vol. 795, pp. 180–186. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-94619-1_17
Sharples, S., Cobb, S., Moody, A., Wilson, J.R.: Virtual reality induced symptoms and effects (VRISE): comparison of head mounted display (HMD), desktop and projection display systems. Displays 29(2), 58–69 (2008)
McCauley, M.E., Sharkey, T.J.: Cybersickness: perception of self-motion in virtual environments. Presence: Teleoper. Virtual Environ. 1(3), 311–318 (1992)
Kennedy, R.S., Lane, N.E., Berbaum, K.S., Lilienthal, M.G.: Simulator sickness questionnaire: an enhanced method for quantifying simulator sickness. Int. J. Aviat. Psychol. 3(3), 203–220 (1993)
So, R.H., Lo, W.T., Ho, A.T.: Effects of navigation speed on motion sickness caused by an immersive virtual environment. Hum. Factors 43(3), 452–461 (2001)
Bozgeyikli, E., Raij, A., Katkoori, S., Dubey, R.: Point & teleport locomotion technique for virtual reality. In: 2016 Annual Symposium on Computer-Human Interaction in Play, pp. 205–216, Austin, Texas, USA (2016)
Weißker, T., Kunert, A., Fröhlich, B., Kulik, A.: Spatial updating and simulator sickness during steering and jumping in immersive virtual environments. In: IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 97–104 (2018)
Smink, K., Carruth, D.W., Swan, W., Davis, E.: A new traversal method for virtual reality: overcoming the drawbacks of commonly accepted methods. In: Human Computer Interaction International, Orlando, FL, USA (2019)
Norouzi, N., Bruder, G., Welch, G.: Assessing vignetting as a means to reduce VR sickness during amplified head rotations. In: SAP 2018: ACM Symposium on Applied Perception 2018, p. 8. ACM, Vancouver (2018)
Farmani, Y., Teather, R.J.: Viewpoint snapping to reduce cybersickness in virtual reality. In: Graphics Interface, pp. 1–8 (2018)
Whitton, M.C., et al.: Comparing VE locomotion interfaces. In: IEEE Virtual Reality 2005, VR 2005, pp. 123–130. IEEE, Bonn (2005)
Loup, G., Loup-Escande, E.: Effects of travel modes on performances and user comfort: a comparison between ArmSwinger and teleporting. Int. J. Hum.-Comput. Interact. (2018)
Razzaque, S., Kohn, Z., Whitton, M.C.: Redirected walking. In: EUROGRAPHICS 2001 (2001)
Chan, L., Minamizawa, K.: FrontFace: facilitating communication between HMD users and outsiders using front-facing-screen HMDs. In: 19th International Conference on Human-Computer Interaction with Mobile Devices and Services, pp. 22:1–22:5. ACM, New York (2017)
Greenwald, S.W., Corning, W., Funk, M., Maes, P.: Comparing learning in virtual reality with learning on a 2D screen using electrostatics activities. J. Univ. Comput. Sci. 24(2), 220–245 (2018)
Shu, Y., Huang, Y.-Z., Chang, S.-H., Chen, M.-Y.: Do virtual reality head-mounted displays make a difference? A comparison of presence and self-efficacy between head-mounted displays and desktop computer-facilitated virtual environments. In: Virtual Reality (2018)
Sousa Santos, B., et al.: Head-mounted display versus desktop for 3D navigation in virtual reality: a user study. Multimed. Tools Appl. 41(1), 161–181 (2009)
Dr Robot: Jaguar V4 Mobile Robot Platform. http://jaguar.drrobot.com/specification_V4.asp. Accessed 31 Jan 2019
Troth, S.: Desert Environment. http://devassets.com/assets/desert-environment/. Accessed 31 Jan 2019
Miguel, J.C.: Shooting Gallery. https://assetstore.unity.com/packages/3d/environments/shooting-gallery-enviroment-pack-105306. Accessed 31 Jan 2019
Brooke, J.: SUS: a “quick and dirty” usability scale. In: Jordan, P.W., Thomas, B., Weerdmeester, B.A., McClelland, A.L. (eds.) Usability Evaluation in Industry. Taylor and Francis, London (1996)
Witmer, B.G., Jerome, C.J., Singer, M.J.: The factor structure of the presence questionnaire. Presence: Teleoper. Virtual Environ. 14(3), 298–312 (2005)
Witmer, B.G., Singer, M.J.: Measuring presence in virtual environments: a presence questionnaire. Presence: Teleoper. Virtual Environ. 7(3), 225–240 (1998)
© 2019 Springer Nature Switzerland AG

Carruth, D.W., Hudson, C.R., Bethel, C.L., Pleva, M., Ondas, S., Juhar, J.: Using HMD for Immersive Training of Voice-Based Operation of Small Unmanned Ground Vehicles. In: Chen, J., Fragomeni, G. (eds.) Virtual, Augmented and Mixed Reality. Applications and Case Studies. HCII 2019. LNCS, vol. 11575. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21565-1_3