Enriching videos with automatic place recognition in google maps

  • 1201: Video on Demand over Over The Top Platform
Multimedia Tools and Applications

Abstract

The availability of videos has grown rapidly in recent years. Automatically extracting relevant information from videos, so that it can be found and browsed, is not an easy task, but it has become an indispensable feature given the immense number of digital products available. In this paper, we present a system that automatically extracts such information from videos. The system uses a re-trained OpenNLP model to locate all the places and famous people included in a specific video, and then obtains information about these named entities from the Google Knowledge Graph. We also present the Automatic Georeferencing Video (AGV) system developed by RAI Teche for the European project “La Città Educante” (The Educating City: teaching and learning processes in cross-media ecosystem). RAI (Radiotelevisione italiana) is the national public broadcasting company of Italy, owned by the Ministry of Economy and Finance. Our system contributes to The Educating City project by providing the technological environment to create statistical models for automatic named entity recognition (NER), and it has been implemented in the field of education, initially for Italian. The system has been applied to the learning challenges facing the world of educational media and has demonstrated how beneficial combining topical news content with scientific content can be in education.
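The Knowledge Graph lookup mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the entity names have already been produced by the NER model and that the public Knowledge Graph Search API is queried over HTTP. The function names, the "Place"/"Person" type filter, the Italian language setting and the sample response are illustrative assumptions and are not taken from the AGV implementation. The sample response is hand-written to mimic the documented response layout and is not real API output.

```python
import json
from urllib.parse import urlencode

# Public endpoint of the Google Knowledge Graph Search API.
KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_kg_search_url(query, api_key, types=("Place", "Person"),
                        language="it", limit=1):
    """Build a search URL for one recognized entity (hypothetical helper).

    The type filter and the language are assumptions chosen to match the
    places/people focus and the Italian setting described in the abstract.
    """
    params = [
        ("query", query),
        ("key", api_key),
        ("limit", str(limit)),
        ("languages", language),
    ]
    # One "types" parameter per requested schema.org type.
    params += [("types", t) for t in types]
    return KG_ENDPOINT + "?" + urlencode(params)


def summarize_kg_response(response):
    """Reduce a parsed API response to the fields a video annotation needs."""
    summaries = []
    for element in response.get("itemListElement", []):
        result = element.get("result", {})
        summaries.append({
            "name": result.get("name"),
            "types": result.get("@type", []),
            "description": result.get("description"),
            "score": element.get("resultScore"),
        })
    return summaries


# Hand-written sample with made-up values, shaped like an API response.
SAMPLE_RESPONSE = json.loads("""
{
  "@type": "ItemList",
  "itemListElement": [
    {
      "@type": "EntitySearchResult",
      "result": {
        "name": "Napoli",
        "@type": ["Place", "City", "Thing"],
        "description": "City in Italy"
      },
      "resultScore": 100.0
    }
  ]
}
""")

if __name__ == "__main__":
    # "YOUR_API_KEY" is a placeholder; no network request is made here.
    print(build_kg_search_url("Napoli", "YOUR_API_KEY"))
    print(summarize_kg_response(SAMPLE_RESPONSE))
```

Keeping URL construction separate from response parsing lets the parsing step be checked offline against stored responses, without an API key.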

Notes

  1. https://www.avinteractive.com/features/comment/ai-technology-can-impact-inspire-students-higher-education-22-10-2019/. Last seen July 30, 2021.

  2. http://videolectures.net/. Last seen July 30, 2021.

  3. http://www.evalita.it/. Last seen July 30, 2021.

  4. http://www.teche.rai.it/. Last seen July 30, 2021.

  5. https://almacloud.inet2.org/GeoreferencingProject-1.0/index.html

  6. https://www.google.com/maps. Last seen July 30, 2021.

  7. https://www.bing.com/maps. Last seen July 30, 2021.

  8. https://ffmpeg.org/. Last seen July 30, 2021.

  9. https://opennlp.apache.org/. Last seen July 30, 2021.

  10. https://developers.google.com/knowledge-graph. Last seen July 30, 2021.

  11. https://www.postgresql.org/. Last seen July 30, 2021.

  12. JavaScript Object Notation (JSON).

  13. https://www.rainews.it/tgr/rubriche/leonardo/. Last seen July 30, 2021.

  14. The Ministry of Education, University and Research (in Italian: Ministero dell’Istruzione, dell’Università e della Ricerca or MIUR).

Author information

Corresponding author

Correspondence to Francesca Fallucchi.

About this article

Cite this article

Fallucchi, F., Di Stabile, R., Purificato, E. et al. Enriching videos with automatic place recognition in google maps. Multimed Tools Appl 81, 23105–23121 (2022). https://doi.org/10.1007/s11042-021-11253-9

  • DOI: https://doi.org/10.1007/s11042-021-11253-9
