Abstract
In this paper we present a preliminary study of an assistive technology to support blind and visually impaired people (BVIP) in perceiving and navigating indoor environments. In the VISaVIS project we aim to design a proof-of-concept wearable device that helps BVIPs recognize the shape of the surrounding environment, thus facilitating their movements. In particular, the device is intended to create, at run time, a sound representation of the environment captured by a head-mounted RGBD camera. The underlying idea is that, through the sonification of the video images captured by the camera, the user will progressively learn to associate the perceived sound with information such as the distance, size, and shape of the obstacles they are framing. We qualitatively validated our proposal in two challenging and general scenarios, and we provide demo videos to demonstrate the effectiveness of our sonification strategy.
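The general idea of sonifying depth data can be illustrated with a toy sketch. This is not the VISaVIS method described in the paper; the mapping below (nearer obstacles produce louder, higher-pitched tones, and horizontal image position controls stereo pan) and all function names are illustrative assumptions only.

```python
def sonify_depth_row(depth_row, max_depth=5.0, f_min=200.0, f_max=2000.0):
    """Map each depth sample (metres) in one image row to audio parameters.

    Returns a list of (frequency_hz, amplitude, pan) tuples:
    - frequency rises with proximity (f_min at max_depth, f_max at 0 m),
    - amplitude grows from 0 (far) to 1 (very close),
    - pan spans [-1, 1] from the left to the right edge of the frame.
    """
    n = len(depth_row)
    params = []
    for i, d in enumerate(depth_row):
        d = min(max(d, 0.0), max_depth)          # clamp to sensor range
        proximity = 1.0 - d / max_depth          # 1 = very close, 0 = far
        freq = f_min + proximity * (f_max - f_min)
        amp = proximity
        pan = -1.0 + 2.0 * i / (n - 1) if n > 1 else 0.0
        params.append((freq, amp, pan))
    return params

# Example: an obstacle 1 m away on the left, open space to the right
params = sonify_depth_row([1.0, 4.0, 5.0])
```

In this toy mapping, the obstacle on the left yields a loud, high-pitched tone panned hard left, while the open space on the right is nearly silent; a real system would additionally encode object size and shape, as the paper proposes.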
Acknowledgements
This work is supported by European Comfort S.r.l. and the University of Verona through the Joint Research funding scheme with the project “Vis-a-Vis”. The authors would like to thank Mr. Giambattista Bersanelli and Mr. Marco Delucca for their valuable support in the problem definition and functional design of the proposed solution.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Onofrei, M., Castellini, F., Pravadelli, G., Drioli, C., Setti, F. (2023). Video Sonification to Support Visually Impaired People: The VISaVIS Approach. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds) Image Analysis and Processing – ICIAP 2023. ICIAP 2023. Lecture Notes in Computer Science, vol 14234. Springer, Cham. https://doi.org/10.1007/978-3-031-43153-1_42
Print ISBN: 978-3-031-43152-4
Online ISBN: 978-3-031-43153-1