Object Detection to Assist Visually Impaired People: A Deep Neural Network Adventure

Bashiri, Fereshteh S.; LaRose, Eric; Badger, Jonathan C.; D’Souza, Roshan M.; Yu, Zeyun; Peissig, Peggy

doi:10.1007/978-3-030-03801-4_44

Fereshteh S. Bashiri²⁵,²⁶,
Eric LaRose²⁵,
Jonathan C. Badger²⁵,
Roshan M. D’Souza²⁶,
Zeyun Yu²⁶ &
Peggy Peissig²⁵

Conference paper
First Online: 10 November 2018

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11241)

Abstract

Blindness or vision impairment, one of the top ten disabilities among men and women, affects more than 7 million Americans of all ages. Accessible visual information is of paramount importance for improving the independence and safety of blind and visually impaired people, and there is a pressing need to develop smart automated systems to assist their navigation, specifically in unfamiliar healthcare environments such as clinics, hospitals, and urgent care centers. This contribution focuses on developing computer vision algorithms combined with a deep neural network to assist the mobility of visually impaired individuals in clinical environments by accurately detecting doors, stairs, and signage, the most prominent landmarks. Quantitative experiments demonstrate that, given a sufficient number of training samples, the network recognizes the objects of interest with an accuracy of over 98% within a fraction of a second.

Acknowledgements

The authors gratefully acknowledge Dr. Ahmad Pahlavan Tafti for his contributions to study design, data collection, and drafting the manuscript. Our special thanks go to Daniel Wall and Anne Nikolai at Marshfield Clinic Research Institute (MCRI) for their help in collecting the dataset and preparing the current paper. F.S. Bashiri would like to thank the Summer Research Internship Program (SRIP) at MCRI for financial support. Furthermore, we gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro M5000 GPU used for this research.

Author information

Authors and Affiliations

Marshfield Clinic Research Institute, Marshfield, USA
Fereshteh S. Bashiri, Eric LaRose, Jonathan C. Badger & Peggy Peissig
University of Wisconsin-Milwaukee, Milwaukee, USA
Fereshteh S. Bashiri, Roshan M. D’Souza & Zeyun Yu

Corresponding author

Correspondence to Fereshteh S. Bashiri.

Editor information

Editors and Affiliations

University of Nevada, Reno, USA
George Bebis
NASA Ames Research Center, Moffett Field, USA
Richard Boyle
University of Nevada, Reno, USA
Bahram Parvin
Desert Research Institute, Reno, USA
Darko Koracin
DARPA, Arlington, USA
Matt Turek
University of Utah, Salt Lake City, USA
Srikumar Ramalingam
National University of Defense Technology, Changsha, China
Kai Xu
Microsoft Research Asia, Beijing, China
Stephen Lin
Bosch Research, Farmington Hills, MI, USA
Bilal Alsallakh
University of North Carolina at Charlotte, Charlotte, USA
Jing Yang
Microsoft Research, Redmond, USA
Eduardo Cuervo
University of Colorado at Colorado Springs, Colorado Springs, USA
Jonathan Ventura

About this paper

Cite this paper

Bashiri, F.S., LaRose, E., Badger, J.C., D’Souza, R.M., Yu, Z., Peissig, P. (2018). Object Detection to Assist Visually Impaired People: A Deep Neural Network Adventure. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2018. Lecture Notes in Computer Science, vol 11241. Springer, Cham. https://doi.org/10.1007/978-3-030-03801-4_44

DOI: https://doi.org/10.1007/978-3-030-03801-4_44
Published: 10 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03800-7
Online ISBN: 978-3-030-03801-4
eBook Packages: Computer Science, Computer Science (R0)
