Abstract
Augmented Reality (AR) is struggling to reach maturity for the mass market. Many challenging issues in AR-related fields remain to be identified and addressed. Artificial Intelligence (AI) seems the most promising way to overcome these limitations, since the two can be combined to obtain unique and immersive experiences. In this work, we therefore focus on integrating Deep Learning (DL) models into the AR development pipeline. This paper describes an experiment, performed as a comparative study, to evaluate whether classification and/or object detection can be used as an alternative way to track objects in AR. To this end, we implemented a mobile application that exploits AI-based models for classification and object detection and, at the same time, projects the results in the AR environment. Several off-the-shelf devices were used to make the comparison consistent and to provide the community with useful insights into the opportunity to integrate AI models in AR environments, and into the extent to which this is convenient. Performance was tested in terms of both memory consumption and processing time, for both Android and iOS applications.
Acknowledgment
This project has received funding from the European Union’s Horizon 2020 research and innovation programme through the XR4ALL project with grant agreement No 825545.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Pierdicca, R., Tonetto, F., Mameli, M., Rosati, R., Zingaretti, P. (2022). Can AI Replace Conventional Markerless Tracking? A Comparative Performance Study for Mobile Augmented Reality Based on Artificial Intelligence. In: De Paolis, L.T., Arpaia, P., Sacco, M. (eds) Extended Reality. XR Salento 2022. Lecture Notes in Computer Science, vol 13446. Springer, Cham. https://doi.org/10.1007/978-3-031-15553-6_13
DOI: https://doi.org/10.1007/978-3-031-15553-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15552-9
Online ISBN: 978-3-031-15553-6
eBook Packages: Computer Science, Computer Science (R0)