
On the Impact of Location-related Terms in Neural Embeddings for Content Similarity Measures in Cultural Heritage Recommender Systems

  • Conference paper
Web and Wireless Geographical Information Systems (W2GIS 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13238))

Abstract

Analysing text to detect semantic similarities is a recent breakthrough of Natural Language Processing that has enabled many novel applications in different fields. One domain that could greatly benefit from this innovation is Location-based and/or Touristic Recommender Systems, where users receive suggestions based on the items they liked in the past. In this work, we consider the use of neural embeddings weighted using Smooth-Inverse Frequency (SIF) to detect semantic similarities in textual descriptions found in a large graph database covering Italian cultural Points of Interest (POIs). Of all similar pairs detected on a national scale, 19% are composed of POIs that do not belong to the same ontological category, highlighting the potential neural embeddings have to match POIs beyond the categories they have been assigned to. However, since text descriptions also contain references to the places where POIs are found, similarities can be detected among POIs sharing the same location, especially in the case of low-frequency geographical terms. While this may be desirable in some cases, it may harm location-aware applications, as the POIs' positions are already known. By comparing the occurrence probabilities of city names in the full text corpus and in location-constrained sub-corpora, we observed probability shifts of 232% on average. This suggests that, for the specific case of location-aware services, SIF-weighted neural embeddings should use location-constrained sub-corpora for term occurrence probability computation in order to efficiently remove uninteresting information.
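The effect described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the standard SIF weight a / (a + p(w)) from Arora et al., whitespace tokenisation, a toy four-document corpus, and a relative shift defined as (p_local - p_full) / p_full. All function names, the toy texts, the two-dimensional word vectors, and the value of a are hypothetical. The common-component removal step of the full SIF method is omitted.

```python
from collections import Counter


def term_probabilities(docs):
    """Unigram occurrence probability p(w) over whitespace-tokenised documents."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def sif_weight(p_w, a=1e-3):
    """SIF weight a / (a + p(w)); frequent terms get smaller weights."""
    return a / (a + p_w)


def sif_embedding(tokens, vectors, probs, a=1e-3):
    """SIF-weighted average of word vectors (no principal-component removal)."""
    dim = len(next(iter(vectors.values())))
    acc = [0.0] * dim
    used = 0
    for tok in tokens:
        if tok in vectors:
            w = sif_weight(probs.get(tok, 0.0), a)
            acc = [x + w * v for x, v in zip(acc, vectors[tok])]
            used += 1
    return [x / used for x in acc] if used else acc


# Toy corpus (hypothetical): the full corpus and its Naples-only sub-corpus.
full_corpus = [
    "church in napoli with baroque frescoes",
    "museum in napoli with roman statues",
    "castle in torino with medieval towers",
    "museum in milano with modern paintings",
]
naples_corpus = full_corpus[:2]

p_full = term_probabilities(full_corpus)
p_local = term_probabilities(naples_corpus)

# Relative shift of the city name's probability (assumed definition).
shift = (p_local["napoli"] - p_full["napoli"]) / p_full["napoli"]

# The city name is down-weighted more strongly with local statistics.
w_full = sif_weight(p_full["napoli"])
w_local = sif_weight(p_local["napoli"])

# Hypothetical 2-d vectors: axis 0 carries "location", axis 1 carries "content".
vectors = {"napoli": [1.0, 0.0], "museum": [0.0, 1.0]}
emb_full = sif_embedding(["museum", "napoli"], vectors, p_full)
emb_local = sif_embedding(["museum", "napoli"], vectors, p_local)

print(f"shift: {shift:.0%}, weight full: {w_full:.4f}, weight local: {w_local:.4f}")
```

In this toy setting the city name is twice as probable in the sub-corpus as in the full corpus, so its SIF weight drops and the location component of the description embedding shrinks while the content component is unchanged.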



Author information

Corresponding author

Correspondence to Sergio Di Martino.


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Origlia, A., Di Martino, S. (2022). On the Impact of Location-related Terms in Neural Embeddings for Content Similarity Measures in Cultural Heritage Recommender Systems. In: Karimipour, F., Storandt, S. (eds) Web and Wireless Geographical Information Systems. W2GIS 2022. Lecture Notes in Computer Science, vol 13238. Springer, Cham. https://doi.org/10.1007/978-3-031-06245-2_10


  • DOI: https://doi.org/10.1007/978-3-031-06245-2_10


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06244-5

  • Online ISBN: 978-3-031-06245-2

  • eBook Packages: Computer Science, Computer Science (R0)
