Abstract
This study proposes a wearable guide device for blind or visually impaired persons based on video streaming and deep learning. The work aims to supplement the white cane and offer its users greater freedom of movement and independence, while the rich environmental information the device provides further enhances their safety. For computer vision, the device uses an RGB camera instead of the RGB-D camera commonly employed in such systems: deep learning converts the RGB images into depth images, from which planes are computed to detect indoor objects and safe walking routes. A convolutional neural network (CNN) is adopted; its layered structure, loosely modeled on the human brain, simulates the neural transmission mechanism involved in human learning. The system can therefore learn a large number of route features and generate a model from the learning results. The proposed system helps blind or visually impaired persons identify flat, safe walking routes.
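No code accompanies the paper, so the following Python sketch is only a rough illustration of the pipeline the abstract outlines: a dense depth map (here synthetic, standing in for the CNN's monocular depth prediction) is back-projected to 3-D points, a ground plane is fitted, and pixels near that plane are flagged as walkable. The camera intrinsics, thresholds, and function names are all illustrative assumptions, not the authors' method; the RANSAC plane fit is a common stand-in for the plane calculation the abstract mentions.

```python
import numpy as np

# Illustrative sketch only (not the authors' implementation): a synthetic
# depth map stands in for the CNN's monocular depth prediction, and a
# RANSAC plane fit flags pixels that lie on a flat, walkable surface.

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) into 3-D camera coordinates (pinhole model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)              # (h, w, 3)

def fit_plane_ransac(points, n_iter=200, tol=0.03, seed=0):
    """Fit a plane n.p + d = 0 to 3-D points; return (unit normal, offset)."""
    rng = np.random.default_rng(seed)
    best_n, best_d, best_count = None, None, 0
    for _ in range(n_iter):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-9:                     # degenerate sample
            continue
        n = n / np.linalg.norm(n)
        d = -n @ p0
        count = np.sum(np.abs(points @ n + d) < tol)     # inliers within tol
        if count > best_count:
            best_n, best_d, best_count = n, d, count
    return best_n, best_d

# Synthetic scene standing in for the CNN output: a wall 3 m ahead in the
# upper image half, and a floor 0.8 m below the camera in the lower half.
h, w = 120, 160
fx = fy = 150.0                                          # assumed intrinsics
cx, cy = w / 2, h / 2
v = np.arange(h, dtype=float)[:, None]
column = np.where(v > cy, fy * 0.8 / np.maximum(v - cy, 1.0), 3.0)
depth = np.repeat(column, w, axis=1)                     # (h, w) depth map

pts = backproject(depth, fx, fy, cx, cy)
ground_region = pts[h // 2:].reshape(-1, 3)              # likely floor pixels
n, d = fit_plane_ransac(ground_region)
walkable = np.abs(pts.reshape(-1, 3) @ n + d).reshape(h, w) < 0.05
print(f"plane normal: {n.round(2)}, walkable pixels: {walkable.mean():.0%}")
```

In the actual device, the depth map would come from the monocular depth-estimation CNN rather than a synthetic scene, and the inlier threshold would be tuned to the depth accuracy the network achieves.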
Acknowledgements
This research was supported in part by the Ministry of Science and Technology of Taiwan (contracts MOST-108-2221-E-019-038-MY2, MOST-107-2221-E-019-039-MY2, MOST-109-2634-F-008-007, and MOST-109-2634-F-019-001) and by the University System of Taipei Joint Research Program (contract USTP-NTUT-NTOU-109-01), Taiwan.