Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage

  • Conference paper
New Trends in Database and Information Systems (ADBIS 2022)

Abstract

In this paper, we describe our recent findings in interlinking the entities of ArCo, the Italian cultural heritage knowledge graph, with the well-known Getty Art and Architecture (GVP) Thesaurus. We automatically extract candidate entities from textual descriptions and then prune ambiguous out-of-domain entities using Neural Word Sense Disambiguation. The disambiguation task is particularly complex since, as detailed in this paper, we map Italian entities in the ArCo cultural heritage knowledge graph onto lexical concepts in English (such as those in the GVP Thesaurus). To date, the majority of entity linking and word sense disambiguation systems are designed to work with English and to operate with general-purpose sense inventories and knowledge bases, such as DBpedia, BabelNet and WordNet. To address this challenging entity linking and disambiguation task, we adapted a state-of-the-art Neural Word Sense Disambiguation system to work in this multi-language setting. Here we describe our adaptation process and discuss preliminary experimental results.
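
The extract-then-prune idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' system: the `Candidate` structure, the `prune` function, the threshold, and both glosses are hypothetical, and a simple token-overlap scorer stands in for the neural word sense disambiguation model so that the example runs without any external model or service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A thesaurus concept proposed as a link target (hypothetical structure)."""
    label: str
    gloss: str  # illustrative gloss, not taken from the real thesaurus


def gloss_overlap(context: str, gloss: str) -> float:
    """Stand-in scorer: fraction of gloss tokens that also occur in the context.

    A neural WSD model would replace this function; the overlap measure is
    only an assumption made to keep the sketch self-contained.
    """
    context_tokens = set(context.lower().split())
    gloss_tokens = set(gloss.lower().split())
    if not gloss_tokens:
        return 0.0
    return len(context_tokens & gloss_tokens) / len(gloss_tokens)


def prune(context, candidates, scorer=gloss_overlap, threshold=0.2):
    """Keep only candidates whose gloss scores at or above the threshold."""
    return [c for c in candidates if scorer(context, c.gloss) >= threshold]


# Translated description of a cultural heritage item (made-up example).
description = "floral decorative motifs on the pedestal of a statue"
candidates = [
    Candidate("pedestal (support)", "base that supports a statue or column"),
    Candidate("pedestal (furniture)", "desk section containing drawers"),
]
kept = prune(description, candidates)
print([c.label for c in kept])
```

Passing the scorer as a parameter keeps the pruning step independent of the disambiguation model, so a neural scorer could be substituted without changing `prune`.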

Notes

  1. https://sites.google.com/uniroma1.it/agdli/.

  2. ArCo “indirectly reuses: DOLCE-Zero, DOLCE+DnS, CIDOC-CRM, EDM, BIBFRAME, FRBR, FaBiO, FEntry, OAEntry” ARCO - PRIMER GUIDE V1.0 http://wit.istc.cnr.it/arco/primer-guide-v1.0-en.html.

  3. Further worsened, as previously mentioned, by the automated translation into English of the concepts extracted from the Italian textual fields in ArCo.

  4. “The idea of an AI-complete problem has been around since at least the late 1970s, and refers to the more formal idea of the technique used to confirm the computational complexity of NP-complete problems.” [10].

  5. MDZ Digital Library team (dbmdz) at the Bavarian State Library, https://huggingface.co/dbmdz/bert-base-italian-xxl-cased.

  6. https://dati.cultura.gov.it/sparql.

  7. Google Translation AI, https://cloud.google.com/translate.

  8. floral decorative motifs (pedestal) - Neapolitan workshop (last quarter of the 19th century).

  9. https://www.signll.org/conll/.

  10. https://spacy.io/.

References

  1. Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: DBpedia: a nucleus for a web of open data. In: Aberer, K., et al. (eds.) ASWC/ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76298-0_52

  2. de Bem Machado, A., Secinaro, S., Calandra, D., Lanzalonga, F.: Knowledge management and digital transformation for Industry 4.0: a structured literature review. Knowl. Manage. Res. Pract. 20(2), 320–338 (2022). https://doi.org/10.1080/14778238.2021.2015261

  3. Bevilacqua, M., Pasini, T., Raganato, A., Navigli, R.: Recent trends in word sense disambiguation: a survey. In: Zhou, Z.H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Survey Track, pp. 4330–4338. International Joint Conferences on Artificial Intelligence Organization, August 2021. https://doi.org/10.24963/ijcai.2021/593

  4. Binucci, C., De Luca, F., Di Giacomo, E., Liotta, G., Montecchiani, F.: Designing the content analyzer of a travel recommender system. Exp. Syst. Appl. 87, 199–208 (2017). https://doi.org/10.1016/j.eswa.2017.06.028

  5. Carriero, V.A., et al.: ArCo: the Italian cultural heritage knowledge graph. In: Ghidini, C., et al. (eds.) ISWC 2019. LNCS, vol. 11779, pp. 36–52. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30796-7_3

  6. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 4171–4186. Association for Computational Linguistics, June 2019. https://doi.org/10.18653/v1/N19-1423

  7. Faralli, S., Lenzi, A., Velardi, P.: AGDLI: ArCo, GVP and DBpedia linking initiative. In: Seneviratne, O., Pesquita, C., Sequeda, J., Etcheverry, L. (eds.) Proceedings of the ISWC 2021 Posters, Demos and Industry Tracks: From Novel Ideas to Industrial Practice co-located with 20th International Semantic Web Conference, ISWC 2021, Virtual Conference, CEUR Workshop Proceedings, 24–28 October 2021, vol. 2980. CEUR-WS.org (2021). https://ceur-ws.org/Vol-2980/paper304.pdf

  8. Feigenbaum, E.A.: Knowledge engineering. Ann. NY Acad. Sci. 426(1), 91–107 (1984). https://doi.org/10.1111/j.1749-6632.1984.tb16513.x

  9. Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. Language, Speech, and Communication. MIT Press, Cambridge, MA (1998)

  10. Goebel, R.: Folk reducibility and AI-complete problems. In: Dengel, A.R., Berns, K., Breuel, T.M., Bomarius, F., Roth-Berghofer, T.R. (eds.) KI 2008. LNCS (LNAI), vol. 5243, pp. 1–1. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85845-4_1

  11. Hadiwinoto, C., Ng, H.T., Gan, W.C.: Improved word sense disambiguation using pre-trained contextualized word representations. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019, pp. 5296–5305. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1533

  12. Harpring, P.: Development of the Getty vocabularies: AAT, TGN, ULAN, and CONA. Art Documentation J. Art Libr. Soc. North Am. 29(1), 67–72 (2010). https://www.jstor.org/stable/27949541

  13. Hoppe, F., Dessì, D., Sack, H.: Deep learning meets knowledge graphs for scholarly data classification. In: Companion Proceedings of the Web Conference 2021, WWW 2021, pp. 417–421. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3442442.3451361

  14. Huang, L., Sun, C., Qiu, X., Huang, X.: GlossBERT: BERT for word sense disambiguation with gloss knowledge. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019, pp. 3507–3512. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1355

  15. Janev, V., Graux, D., Jabeen, H., Sallinger, E. (eds.): Knowledge Graphs and Big Data Processing. LNCS, vol. 12072. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-53199-7

  16. Jiaju, D., Fanchao, Q., Maosong, S.: Using BERT for Word Sense Disambiguation, September 2019. https://doi.org/10.48550/arXiv.1909.08358

  17. Luo, F., Liu, T., He, Z., Xia, Q., Sui, Z., Chang, B.: Leveraging gloss knowledge in neural word sense disambiguation by hierarchical co-attention. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October–November 2018, pp. 1402–1411. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/D18-1170

  18. Mladineo, M., Crnjac Zizic, M., Aljinovic, A., Gjeldum, N.: Towards a knowledge-based cognitive system for industrial application: case of personalized products. J. Ind. Inf. Integr. 27, 100284 (2022). https://doi.org/10.1016/j.jii.2021.100284

  19. Navigli, R.: Word sense disambiguation: a survey. ACM Comput. Surv. 41(2), 1–69 (2009). https://doi.org/10.1145/1459352.1459355

  20. Pokojski, J., Szustakiewicz, K., Woźnicki, Ł., Oleksiński, K., Pruszyński, J.: Industrial application of knowledge-based engineering in commercial CAD/CAE systems. J. Ind. Inf. Integr. 25, 100255 (2022). https://doi.org/10.1016/j.jii.2021.100255

  21. Rossi, A., Barbosa, D., Firmani, D., Matinata, A., Merialdo, P.: Knowledge graph embedding for link prediction: a comparative analysis. ACM Trans. Knowl. Discov. Data 15(2), 1–49 (2021). https://doi.org/10.1145/3424672

  22. Shen, W., Wang, J., Han, J.: Entity linking with a knowledge base: issues, techniques, and solutions. IEEE Trans. Knowl. Data Eng. 27(2), 443–460 (2015). https://doi.org/10.1109/TKDE.2014.2327028

  23. Tiddi, I., Schlobach, S.: Knowledge graphs as tools for explainable machine learning: a survey. Artif. Intell. 302, 103627 (2022). https://doi.org/10.1016/j.artint.2021.103627

  24. Wiedemann, G., Remus, S., Chawla, A., Biemann, C.: Does BERT make any sense? Interpretable word sense disambiguation with contextualized embeddings. In: Proceedings of the 15th Conference on Natural Language Processing, KONVENS 2019, Erlangen, Germany, 9–11 October 2019 (2019). https://corpora.linguistik.uni-erlangen.de/data/konvens/proceedings/papers/KONVENS2019_paper_43.pdf

  25. Yap, B.P., Koh, A., Chng, E.S.: Adapting BERT for word sense disambiguation with gloss selection objective and example sentences. In: Findings of the Association for Computational Linguistics, EMNLP 2020, Online, pp. 41–46. Association for Computational Linguistics, November 2020. https://www.aclweb.org/anthology/2020.findings-emnlp.4

  26. Zheng, C., et al.: Knowledge-based program generation approach for robotic manufacturing systems. Robot. Comput. Integr. Manuf. 73, 102242 (2022). https://doi.org/10.1016/j.rcim.2021.102242

  27. Zheng, W., Cheng, J., Wu, X., Sun, R., Wang, X., Sun, X.: Domain knowledge-based security bug reports prediction. Knowl. Based Syst. 241, 108293 (2022). https://doi.org/10.1016/j.knosys.2022.108293

  28. Zhuang, Y., Wu, F., Chen, C., Pan, Y.: Challenges and opportunities: from big data to knowledge in AI 2.0. Front. Inf. Technol. Electron. Eng. 18(1), 3–14 (2017). https://doi.org/10.1631/FITEE.1601883

Acknowledgements

This work was carried out within the research project “SMARTOUR: intelligent platform for tourism”, funded by the Italian Ministry of University and Research with the Regional Development Fund of the European Union (PON Research and Competitiveness 2007-2013).

Author information

Corresponding author

Correspondence to Stefano Faralli.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Faggiani, E., Faralli, S., Velardi, P. (2022). Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_54

  • DOI: https://doi.org/10.1007/978-3-031-15743-1_54

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15742-4

  • Online ISBN: 978-3-031-15743-1

  • eBook Packages: Computer Science (R0)
