
The effects of spatial auditory and visual cues on mixed reality remote collaboration

Original Paper · Journal on Multimodal User Interfaces

Abstract

Collaborative Mixed Reality (MR) technologies enable remote people to work together by sharing communication cues intrinsic to face-to-face conversations, such as eye gaze and hand gestures. While the role of visual cues has been investigated in many collaborative MR systems, the use of spatial auditory cues remains underexplored. In this paper, we present an MR remote collaboration system that shares both spatial auditory and visual cues between collaborators to help them complete a search task. Through two user studies in a large office, we found that, compared with non-spatialized audio, the remote expert's spatialized voice and spatialized auditory beacons enabled local workers to find small occluded objects with a significantly stronger spatial perception of the environment. We also found that while the spatial auditory cues could convey the spatial layout and a general direction in which to search for the target object, the visual head frustum and hand gestures intuitively demonstrated the remote expert's movements and the position of the target. Integrating the visual cues (especially the head frustum) with the spatial auditory cues significantly improved the local worker's task performance, social presence, and spatial perception of the environment.
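To make the auditory side of this concrete, the sketch below shows one simple way a mono voice stream can be spatialized from the remote expert's position relative to the local worker, using interaural time and level differences. This is only a minimal illustration under assumed parameters (a spherical-head ITD model, equal-power panning, a fixed 8.75 cm head radius); the system described in the paper appears to have relied on off-the-shelf Unity spatial audio middleware rather than custom code, and every name and value below is hypothetical.

```python
import numpy as np

def spatialize(mono, sr, listener_pos, listener_yaw, source_pos,
               head_radius=0.0875, speed_of_sound=343.0):
    """Pan a mono signal into stereo using crude ITD/ILD cues.

    Hypothetical sketch only: the study's system used off-the-shelf
    spatial audio middleware, not this code.
    """
    # Horizontal angle of the source relative to where the listener faces
    # (0 = straight ahead, +pi/2 = directly to the right).
    dx = source_pos[0] - listener_pos[0]
    dz = source_pos[1] - listener_pos[1]
    azimuth = np.arctan2(dx, dz) - listener_yaw
    azimuth = (azimuth + np.pi) % (2.0 * np.pi) - np.pi  # wrap to [-pi, pi]

    # Interaural time difference (Woodworth's spherical-head model):
    # sound reaches the far ear a fraction of a millisecond later.
    itd = (head_radius / speed_of_sound) * (np.sin(azimuth) + azimuth)
    delay = int(round(abs(itd) * sr))

    # Interaural level difference via equal-power panning.
    pan = np.sin(azimuth)               # -1 = hard left, +1 = hard right
    left_gain = np.sqrt(0.5 * (1.0 - pan))
    right_gain = np.sqrt(0.5 * (1.0 + pan))

    # Delay whichever ear is farther from the source.
    left = np.concatenate([np.zeros(delay if pan > 0 else 0), mono])
    right = np.concatenate([np.zeros(delay if pan < 0 else 0), mono])
    n = max(len(left), len(right))
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left * left_gain, right * right_gain], axis=1)

# Example: the expert's voice appears 2 m to the worker's right.
sr = 48000
voice = 0.2 * np.sin(2 * np.pi * 220 * np.arange(sr) / sr)  # placeholder tone
stereo = spatialize(voice, sr, listener_pos=(0.0, 0.0),
                    listener_yaw=0.0, source_pos=(2.0, 0.0))
```

In a live system the same relative-position computation would presumably run every frame against the expert's avatar pose, and a production renderer would use HRTFs rather than these two crude cues.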

Author information

Corresponding author

Correspondence to Jing Yang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, J., Sasikumar, P., Bai, H. et al. The effects of spatial auditory and visual cues on mixed reality remote collaboration. J Multimodal User Interfaces 14, 337–352 (2020). https://doi.org/10.1007/s12193-020-00331-1
