Abstract
In this paper, we describe our recent findings in interlinking the entities of ArCo, the Italian cultural heritage knowledge graph, to the well-known Getty Art and Architecture (GVP) Thesaurus. Candidate entities are extracted automatically from textual descriptions, and ambiguous out-of-domain entities are then pruned using neural word sense disambiguation. The disambiguation task is particularly complex because, as detailed in this paper, we map Italian entities in ArCo onto lexical concepts in English (such as those in the GVP Thesaurus). To date, the majority of entity linking and word sense disambiguation systems are designed to work with English and to operate with general-purpose sense inventories and knowledge bases, such as DBpedia, BabelNet and WordNet. To address this challenging entity linking and disambiguation task, we adapted a state-of-the-art neural word sense disambiguation system to work in this multi-language setting. We describe our adaptation process and discuss preliminary experimental results.
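As a rough illustration of the pruning step described above, the following minimal sketch keeps only those candidate concepts whose English gloss is compatible with the (translated) description of a cultural heritage item. It is not the authors' system: the lexical-overlap score is a toy stand-in for a neural word sense disambiguation model, and the names `Candidate` and `prune`, the threshold value, the identifiers and the glosses are all hypothetical and are not taken from ArCo or the GVP Thesaurus.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "on", "or", "to", "is", "for", "with", "in", "which"}


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate concept: identifier, label and English gloss."""
    concept_id: str
    label: str
    gloss: str


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count alphabetic tokens that are not stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prune(context: str, candidates: list[Candidate], threshold: float = 0.1) -> list[Candidate]:
    """Keep candidates whose gloss is similar enough to the context.

    The overlap score stands in for a neural scorer; the threshold is arbitrary.
    """
    ctx = bag_of_words(context)
    return [c for c in candidates if cosine(ctx, bag_of_words(c.gloss)) >= threshold]


# Translated item description; the candidate glosses below are invented for illustration.
context = ("floral decorative motifs (pedestal) - Neapolitan workshop "
           "(last quarter of the 19th century)")
candidates = [
    Candidate("ex:1", "pedestal", "base or support for a statue, often carved with decorative motifs"),
    Candidate("ex:2", "pedestal", "position of high regard or admiration given to a person"),
]
kept = prune(context, candidates)
print([c.concept_id for c in kept])
```

In a real setting the `cosine` call would be replaced by the score of a trained disambiguation model over context and gloss pairs; the surrounding filter logic would stay the same.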
Notes
- 1.
- 2. ArCo “indirectly reuses: DOLCE-Zero, DOLCE+DnS, CIDOC-CRM, EDM, BIBFRAME, FRBR, FaBiO, FEntry, OAEntry” (ARCO - PRIMER GUIDE V1.0, http://wit.istc.cnr.it/arco/primer-guide-v1.0-en.html).
- 3. Further worsened, as previously mentioned, by the automated translation into English of the concepts extracted from the Italian textual fields in ArCo.
- 4. “The idea of an AI-complete problem has been around since at least the late 1970s, and refers to the more formal idea of the technique used to confirm the computational complexity of NP-complete problems.” [10].
- 5. MDZ Digital Library team (dbmdz) at the Bavarian State Library, https://huggingface.co/dbmdz/bert-base-italian-xxl-cased.
- 6.
- 7. Google Translation AI, https://cloud.google.com/translate.
- 8. floral decorative motifs (pedestal) - Neapolitan workshop (last quarter of the 19th century).
- 9.
- 10.
References
Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: DBpedia: a nucleus for a web of open data. In: Aberer, K., et al. (eds.) ASWC/ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76298-0_52
de Bem Machado, A., Secinaro, S., Calandra, D., Lanzalonga, F.: Knowledge management and digital transformation for Industry 4.0: a structured literature review. Knowl. Manage. Res. Pract. 20(2), 320–338 (2022). https://doi.org/10.1080/14778238.2021.2015261
Bevilacqua, M., Pasini, T., Raganato, A., Navigli, R.: Recent trends in word sense disambiguation: a survey. In: Zhou, Z.H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Survey Track, pp. 4330–4338. International Joint Conferences on Artificial Intelligence Organization, August 2021. https://doi.org/10.24963/ijcai.2021/593
Binucci, C., De Luca, F., Di Giacomo, E., Liotta, G., Montecchiani, F.: Designing the content analyzer of a travel recommender system. Exp. Syst. Appl. 87, 199–208 (2017). https://doi.org/10.1016/j.eswa.2017.06.028
Carriero, V.A., et al.: ArCo: the Italian cultural heritage knowledge graph. In: Ghidini, C., et al. (eds.) ISWC 2019. LNCS, vol. 11779, pp. 36–52. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30796-7_3
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 4171–4186. Association for Computational Linguistics, June 2019. https://doi.org/10.18653/v1/N19-1423
Faralli, S., Lenzi, A., Velardi, P.: AGDLI: ArCo, GVP and DBpedia linking initiative. In: Seneviratne, O., Pesquita, C., Sequeda, J., Etcheverry, L. (eds.) Proceedings of the ISWC 2021 Posters, Demos and Industry Tracks: From Novel Ideas to Industrial Practice co-located with 20th International Semantic Web Conference, ISWC 2021, Virtual Conference, CEUR Workshop Proceedings, 24–28 October 2021, vol. 2980. CEUR-WS.org (2021). https://ceur-ws.org/Vol-2980/paper304.pdf
Feigenbaum, E.A.: Knowledge engineering. Ann. NY Acad. Sci. 426(1), 91–107 (1984). https://doi.org/10.1111/j.1749-6632.1984.tb16513.x
Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. Language, Speech, and Communication. MIT Press, Cambridge, MA (1998)
Goebel, R.: Folk reducibility and AI-complete problems. In: Dengel, A.R., Berns, K., Breuel, T.M., Bomarius, F., Roth-Berghofer, T.R. (eds.) KI 2008. LNCS (LNAI), vol. 5243, p. 1. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85845-4_1
Hadiwinoto, C., Ng, H.T., Gan, W.C.: Improved word sense disambiguation using pre-trained contextualized word representations. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019, pp. 5296–5305. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1533
Harpring, P.: Development of the Getty vocabularies: AAT, TGN, ULAN, and CONA. Art Documentation J. Art Libr. Soc. North Am. 29(1), 67–72 (2010). https://www.jstor.org/stable/27949541
Hoppe, F., Dessì, D., Sack, H.: Deep learning meets knowledge graphs for scholarly data classification. In: Companion Proceedings of the Web Conference 2021, WWW 2021, pp. 417–421. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3442442.3451361
Huang, L., Sun, C., Qiu, X., Huang, X.: GlossBERT: BERT for word sense disambiguation with gloss knowledge. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019, pp. 3507–3512. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1355
Janev, V., Graux, D., Jabeen, H., Sallinger, E. (eds.): Knowledge Graphs and Big Data Processing. LNCS, vol. 12072. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-53199-7
Jiaju, D., Fanchao, Q., Maosong, S.: Using BERT for Word Sense Disambiguation, September 2019. https://doi.org/10.48550/arXiv.1909.08358
Luo, F., Liu, T., He, Z., Xia, Q., Sui, Z., Chang, B.: Leveraging gloss knowledge in neural word sense disambiguation by hierarchical co-attention. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October–November 2018, pp. 1402–1411. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/D18-1170
Mladineo, M., Crnjac Zizic, M., Aljinovic, A., Gjeldum, N.: Towards a knowledge-based cognitive system for industrial application: case of personalized products. J. Ind. Inf. Integr. 27, 100284 (2022). https://doi.org/10.1016/j.jii.2021.100284
Navigli, R.: Word sense disambiguation: a survey. ACM Comput. Surv. 41(2), 1–69 (2009). https://doi.org/10.1145/1459352.1459355
Pokojski, J., Szustakiewicz, K., Woźnicki, Ł., Oleksiński, K., Pruszyński, J.: Industrial application of knowledge-based engineering in commercial CAD/CAE systems. J. Ind. Inf. Integr. 25, 100255 (2022). https://doi.org/10.1016/j.jii.2021.100255
Rossi, A., Barbosa, D., Firmani, D., Matinata, A., Merialdo, P.: Knowledge graph embedding for link prediction: a comparative analysis. ACM Trans. Knowl. Discov. Data 15(2), 1–49 (2021). https://doi.org/10.1145/3424672
Shen, W., Wang, J., Han, J.: Entity linking with a knowledge base: issues, techniques, and solutions. IEEE Trans. Knowl. Data Eng. 27(2), 443–460 (2015). https://doi.org/10.1109/TKDE.2014.2327028
Tiddi, I., Schlobach, S.: Knowledge graphs as tools for explainable machine learning: A survey. Artif. Intell. 302, 103627 (2022). https://doi.org/10.1016/j.artint.2021.103627
Wiedemann, G., Remus, S., Chawla, A., Biemann, C.: Does BERT make any sense? Interpretable word sense disambiguation with contextualized embeddings. In: Proceedings of the 15th Conference on Natural Language Processing, KONVENS 2019, Erlangen, Germany, 9–11 October 2019 (2019). https://corpora.linguistik.uni-erlangen.de/data/konvens/proceedings/papers/KONVENS2019_paper_43.pdf
Yap, B.P., Koh, A., Chng, E.S.: Adapting BERT for word sense disambiguation with gloss selection objective and example sentences. In: Findings of the Association for Computational Linguistics, EMNLP 2020, Online, pp. 41–46. Association for Computational Linguistics, November 2020. https://www.aclweb.org/anthology/2020.findings-emnlp.4
Zheng, C., et al.: Knowledge-based program generation approach for robotic manufacturing systems. Robot. Comput. Integr. Manuf. 73, 102242 (2022). https://doi.org/10.1016/j.rcim.2021.102242
Zheng, W., Cheng, J., Wu, X., Sun, R., Wang, X., Sun, X.: Domain knowledge-based security bug reports prediction. Knowl. Based Syst. 241, 108293 (2022). https://doi.org/10.1016/j.knosys.2022.108293
Zhuang, Y., Wu, F., Chen, C., Pan, Y.: Challenges and opportunities: from big data to knowledge in AI 2.0. Front. Inf. Technol. Electron. Eng. 18(1), 3–14 (2017). https://doi.org/10.1631/FITEE.1601883
Acknowledgements
This work was carried out within the research project “SMARTOUR: intelligent platform for tourism” funded by the Italian Ministry of University and Research with the Regional Development Fund of the European Union (PON Research and Competitiveness 2007-2013).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Faggiani, E., Faralli, S., Velardi, P. (2022). Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_54
DOI: https://doi.org/10.1007/978-3-031-15743-1_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15742-4
Online ISBN: 978-3-031-15743-1
eBook Packages: Computer Science (R0)