
AI4AR: An AI-Based Mobile Application for the Automatic Generation of AR Contents

  • Conference paper
Augmented Reality, Virtual Reality, and Computer Graphics (AVR 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12242)

Abstract

Augmented reality (AR) is the process of using technology to superimpose images, text, or sounds on top of what a person can already see. Art galleries and museums have started to develop AR applications to increase engagement and provide an entirely new kind of exploration experience. However, content creation is a very time-consuming process, requiring ad-hoc development for each painting to be augmented. In fact, to create an AR experience for any given painting, it is necessary to choose the points of interest, to create digital content, and then to develop the application. While this is affordable for the great masterpieces of an art gallery, it would be impracticable for an entire collection. In this context, this paper proposes developing AR applications based on Artificial Intelligence. In particular, automatic captioning techniques are the core of an AR application that improves the user experience in front of a painting, or an artwork in general. The study demonstrates feasibility through a proof-of-concept application implemented for handheld devices, and adds to the body of knowledge in mobile AR applications, as this approach has not been applied in this field before.
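The pipeline described in the abstract — select points of interest on a painting, generate a textual description for each, and attach the result as an AR overlay — can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in, not the authors' implementation: `ARAnnotation`, `generate_ar_content`, and the `toy_captioner` stub (which replaces a trained image-captioning model such as the one the paper relies on).

```python
from dataclasses import dataclass

@dataclass
class ARAnnotation:
    """One AR overlay: a normalized point of interest plus its caption."""
    x: float
    y: float
    caption: str

def generate_ar_content(regions, caption_fn):
    """For each detected region of a painting, attach an automatically
    generated caption, producing overlays ready for AR rendering."""
    return [ARAnnotation(r["x"], r["y"], caption_fn(r)) for r in regions]

# Stub captioner standing in for a trained captioning model.
def toy_captioner(region):
    return f"a detail showing {region['label']}"

# Hypothetical detector output for two points of interest.
regions = [
    {"x": 0.25, "y": 0.40, "label": "a woman's face"},
    {"x": 0.70, "y": 0.65, "label": "a landscape background"},
]

annotations = generate_ar_content(regions, toy_captioner)
for a in annotations:
    print(f"({a.x:.2f}, {a.y:.2f}) -> {a.caption}")
```

In a real system the `regions` list would come from an object detector and `caption_fn` would wrap a captioning network; the point of the sketch is only that, once both are automatic, no per-painting authoring step remains.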


Notes

  1. https://github.com/aimagelab/speaksee.


Author information


Corresponding author

Correspondence to Roberto Pierdicca.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Pierdicca, R., Paolanti, M., Frontoni, E., Baraldi, L. (2020). AI4AR: An AI-Based Mobile Application for the Automatic Generation of AR Contents. In: De Paolis, L., Bourdot, P. (eds) Augmented Reality, Virtual Reality, and Computer Graphics. AVR 2020. Lecture Notes in Computer Science, vol 12242. Springer, Cham. https://doi.org/10.1007/978-3-030-58465-8_21


  • DOI: https://doi.org/10.1007/978-3-030-58465-8_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58464-1

  • Online ISBN: 978-3-030-58465-8

  • eBook Packages: Computer Science (R0)
