Abstract
This study proposes a wearable guide device for blind or visually impaired persons based on video streaming and deep learning. The work aims to supplement the white cane and offer its users greater freedom of movement and independence, while the rich environmental information the device provides further enhances their safety. For computer vision, the device uses an RGB camera instead of the RGB-D camera commonly employed in such systems: deep learning converts the RGB images into depth images, from which planes are computed to detect indoor objects and safe walking routes. A convolutional neural network (CNN) is adopted; its layered structure, loosely modeled on the human brain, simulates the neural transmission mechanism involved in human learning. The system can therefore learn a large number of route features and generate a model from the learning results. The proposed system helps blind or visually impaired persons identify flat, safe walking routes.
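No code accompanies the paper, so the following Python sketch is only a rough illustration of the pipeline the abstract outlines: a dense depth map (here synthetic, standing in for the CNN's monocular depth prediction) is back-projected to 3-D points, a ground plane is fitted, and pixels near that plane are flagged as walkable. The camera intrinsics, thresholds, and function names are all illustrative assumptions, not the authors' method; the RANSAC plane fit is a common stand-in for the plane calculation the abstract mentions.

```python
import numpy as np

# Illustrative sketch only (not the authors' implementation): a synthetic
# depth map stands in for the CNN's monocular depth prediction, and a
# RANSAC plane fit flags pixels that lie on a flat, walkable surface.

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) into 3-D camera coordinates (pinhole model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)              # (h, w, 3)

def fit_plane_ransac(points, n_iter=200, tol=0.03, seed=0):
    """Fit a plane n.p + d = 0 to 3-D points; return (unit normal, offset)."""
    rng = np.random.default_rng(seed)
    best_n, best_d, best_count = None, None, 0
    for _ in range(n_iter):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-9:                     # degenerate sample
            continue
        n = n / np.linalg.norm(n)
        d = -n @ p0
        count = np.sum(np.abs(points @ n + d) < tol)     # inliers within tol
        if count > best_count:
            best_n, best_d, best_count = n, d, count
    return best_n, best_d

# Synthetic scene standing in for the CNN output: a wall 3 m ahead in the
# upper image half, and a floor 0.8 m below the camera in the lower half.
h, w = 120, 160
fx = fy = 150.0                                          # assumed intrinsics
cx, cy = w / 2, h / 2
v = np.arange(h, dtype=float)[:, None]
column = np.where(v > cy, fy * 0.8 / np.maximum(v - cy, 1.0), 3.0)
depth = np.repeat(column, w, axis=1)                     # (h, w) depth map

pts = backproject(depth, fx, fy, cx, cy)
ground_region = pts[h // 2:].reshape(-1, 3)              # likely floor pixels
n, d = fit_plane_ransac(ground_region)
walkable = np.abs(pts.reshape(-1, 3) @ n + d).reshape(h, w) < 0.05
print(f"plane normal: {n.round(2)}, walkable pixels: {walkable.mean():.0%}")
```

In the actual device, the depth map would come from the monocular depth-estimation CNN rather than a synthetic scene, and the inlier threshold would be tuned to the depth accuracy the network achieves.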
Acknowledgements
This research was supported in part by the Ministry of Science and Technology of Taiwan (contracts MOST-108-2221-E-019-038-MY2, MOST-107-2221-E-019-039-MY2, MOST-109-2634-F-008-007, and MOST-109-2634-F-019-001) and by the University System of Taipei Joint Research Program (contract USTP-NTUT-NTOU-109-01), Taiwan.