Enriching videos with automatic place recognition in google maps

Fallucchi, Francesca; Di Stabile, Rosario; Purificato, Erasmo; Giuliano, Romeo; De Luca, Ernesto William

doi:10.1007/s11042-021-11253-9

1201: Video on Demand over Over The Top Platform
Published: 29 July 2021

Volume 81, pages 23105–23121, (2022)
Multimedia Tools and Applications

Francesca Fallucchi^1,4 (ORCID: orcid.org/0000-0002-3288-044X),
Rosario Di Stabile^2,
Erasmo Purificato^3,4,
Romeo Giuliano^1 &
Ernesto William De Luca^1,2,4

Abstract

The availability of videos has grown rapidly in recent years. Automatically extracting relevant information from videos, so that it can be found and browsed, is not an easy task, but it has become indispensable given the immense number of digital products available. In this paper, we present a system that automatically extracts information from videos. The solution uses a re-trained OpenNLP model to locate all the places and famous people included in a specific video, and it retrieves information about these named entities from the Google Knowledge Graph. We also present the Automatic Georeferencing Video (AGV) system developed by RAI Teche for the European Project “La Città Educante” (The Educating City: teaching and learning processes in cross-media ecosystem); RAI (Radiotelevisione italiana) is the national public broadcasting company of Italy, owned by the Ministry of Economy and Finance. Our system contributes to The Educating City project by providing the technological environment to create statistical models for automatic named entity recognition (NER), and it has been implemented in the field of education, initially in Italian. The system has been applied to the learning challenges facing the world of educational media and has demonstrated how beneficial combining topical news content with scientific content can be in education.
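
The enrichment step mentioned in the abstract (looking up a recognized place or person in the Google Knowledge Graph) might be sketched as follows. This is only an illustration, not the AGV implementation: it assumes the public Knowledge Graph Search API endpoint and its `query`, `types`, `languages`, `limit` and `key` parameters, it assumes the entity name has already been produced by the NER step, and the helper names `build_kg_url` and `best_match` as well as the sample response are hypothetical.

```python
import json
from urllib.parse import urlencode

# Public Knowledge Graph Search API endpoint (assumed to be the one used).
KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_kg_url(entity, entity_type, api_key, language="it", limit=1):
    """Build a search URL for one named entity found by the NER step.

    entity_type is a schema.org type such as "Place" or "Person".
    """
    params = {
        "query": entity,
        "types": entity_type,
        "languages": language,
        "limit": limit,
        "key": api_key,
    }
    return KG_ENDPOINT + "?" + urlencode(params)


def best_match(response_text):
    """Return name, types and description of the highest-scoring result.

    Returns None when the response contains no results.
    """
    data = json.loads(response_text)
    items = data.get("itemListElement", [])
    if not items:
        return None
    top = max(items, key=lambda item: item.get("resultScore", 0.0))
    result = top.get("result", {})
    return {
        "name": result.get("name"),
        "types": result.get("@type", []),
        "description": result.get("description"),
    }


# Hand-written response in the API's JSON-LD shape, for illustration only;
# the values are NOT real API output.
SAMPLE_RESPONSE = json.dumps({
    "itemListElement": [
        {
            "result": {
                "name": "Roma",
                "@type": ["Place", "City"],
                "description": "Capitale d'Italia",
            },
            "resultScore": 10.0,
        },
        {
            "result": {"name": "Roma", "@type": ["Thing"]},
            "resultScore": 2.5,
        },
    ]
})

url = build_kg_url("Roma", "Place", api_key="YOUR_API_KEY")
match = best_match(SAMPLE_RESPONSE)
```

In a real pipeline the URL would be fetched over HTTPS with a valid API key, and the returned entity would then be placed on the map; both steps are left out here so that the sketch runs without network access.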

Notes

https://www.avinteractive.com/features/comment/ai-technology-can-impact-inspire-students-higher-education-22-10-2019/. Last seen July 30, 2021.
http://videolectures.net/. Last seen July 30, 2021.
http://www.evalita.it/. Last seen July 30, 2021.
http://www.teche.rai.it/. Last seen July 30, 2021.
https://almacloud.inet2.org/GeoreferencingProject-1.0/index.html
https://www.google.com/maps. Last seen July 30, 2021.
https://www.bing.com/maps. Last seen July 30, 2021.
https://ffmpeg.org/. Last seen July 30, 2021.
https://opennlp.apache.org/. Last seen July 30, 2021.
https://developers.google.com/knowledge-graph. Last seen July 30, 2021.
https://www.postgresql.org/. Last seen July 30, 2021.
JavaScript Object Notation
https://www.rainews.it/tgr/rubriche/leonardo/. Last seen July 30, 2021.
The Ministry of Education, University and Research (in Italian: Ministero dell’Istruzione, dell’Università e della Ricerca or MIUR).

Author information

Authors and Affiliations

Guglielmo Marconi University, via Plinio 44, Rome, Italy
Francesca Fallucchi, Romeo Giuliano & Ernesto William De Luca
RAI S.p.A, Viale Giuseppe Mazzini 14, Rome, Italy
Rosario Di Stabile & Ernesto William De Luca
Otto von Guericke University Magdeburg, Universitätsplatz 2, Magdeburg, Germany
Erasmo Purificato
Georg Eckert Institute, Celler Straße 3, Braunschweig, Germany
Francesca Fallucchi, Erasmo Purificato & Ernesto William De Luca

Corresponding author

Correspondence to Francesca Fallucchi.

About this article

Cite this article

Fallucchi, F., Di Stabile, R., Purificato, E. et al. Enriching videos with automatic place recognition in google maps. Multimed Tools Appl 81, 23105–23121 (2022). https://doi.org/10.1007/s11042-021-11253-9

Received: 07 October 2020
Revised: 18 January 2021
Accepted: 08 July 2021
Published: 29 July 2021
Issue Date: July 2022
DOI: https://doi.org/10.1007/s11042-021-11253-9
