
Detection and Recognition of Barriers in Egocentric Images for Safe Urban Sidewalks

  • Conference paper
Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020)

Abstract

Walking in modern cities has proven to have significant benefits, particularly for the environment and for citizens’ health. Although society promotes walking as the cheapest and most sustainable means of transportation, many road accidents in recent years have involved pedestrians and cyclists. The frequent presence of obstacles on urban sidewalks endangers citizens’ lives, and their prompt detection and removal are essential for maintaining clean and safe access to urban infrastructure. Following the success of egocentric applications that exploit the continuous use of smartphone devices to address serious problems, we aim to develop methodologies for detecting barriers and other dangerous obstacles encountered by pedestrians on urban sidewalks. For this purpose, a dedicated image dataset is generated and used as the basis for analyzing the performance of three different deep learning architectures in detecting and recognizing different types of obstacles. The high accuracy of the experimental results shows that egocentric applications can successfully help maintain the safety and cleanliness of sidewalks while reducing pedestrian accidents.
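To make the evaluation setting concrete: object detectors of the kind compared here are typically scored by matching predicted bounding boxes against ground-truth boxes using intersection-over-union (IoU). The sketch below is an illustration of this standard matching procedure, not the authors' actual evaluation code; box format and the 0.5 threshold are common conventions, assumed here.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(predictions, ground_truth, threshold=0.5):
    """Greedily match predictions to ground-truth boxes.

    A prediction counts as a true positive when its best-matching
    unused ground-truth box overlaps with IoU >= threshold.
    Returns (true positives, false positives, false negatives).
    """
    unmatched = list(ground_truth)
    tp = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= threshold:
            tp += 1
            unmatched.remove(best)
    return tp, len(predictions) - tp, len(unmatched)
```

Per-class counts of this kind are the basis of the precision/recall and mean-average-precision figures usually reported for detection benchmarks.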



Acknowledgements

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 739578 complemented by the Government of the Republic of Cyprus through the Directorate General for European Programmes, Coordination and Development.

Author information

Correspondence to Zenonas Theodosiou.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Theodosiou, Z., Partaourides, H., Panayi, S., Kitsis, A., Lanitis, A. (2022). Detection and Recognition of Barriers in Egocentric Images for Safe Urban Sidewalks. In: Bouatouch, K., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2020. Communications in Computer and Information Science, vol 1474. Springer, Cham. https://doi.org/10.1007/978-3-030-94893-1_25

  • DOI: https://doi.org/10.1007/978-3-030-94893-1_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94892-4

  • Online ISBN: 978-3-030-94893-1

