Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage

Faggiani, Erica; Faralli, Stefano; Velardi, Paola

doi:10.1007/978-3-031-15743-1_54

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1652)

Included in the following conference series:

European Conference on Advances in Databases and Information Systems

Abstract

In this paper, we describe our recent findings in interlinking the ArCo Italian cultural heritage entities to the well-known Getty Art and Architecture (GVP) Thesaurus. We automatically extract candidate entities from textual descriptions and then prune ambiguous out-of-domain entities using Neural Word Sense Disambiguation. The disambiguation task is particularly complex because, as detailed in this paper, we map Italian entities in the ArCo cultural heritage knowledge graph onto lexical concepts in English (such as those in the GVP Thesaurus). To date, most entity linking and word sense disambiguation systems are designed to work with English and to operate with general-purpose sense inventories and knowledge bases, such as DBpedia, BabelNet and WordNet. To address this challenging entity linking and disambiguation task, we adapted a state-of-the-art Neural Word Sense Disambiguation system to work in this multi-language setting. Here we describe our adaptation process and discuss preliminary experimental results.
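The pruning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: a simple word-overlap (Lesk-style) score stands in for the neural disambiguation model, and the `Candidate` class, the `demo:` concept identifiers, the glosses and the threshold are all hypothetical. The context string is the translated example title that appears in the paper's notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    term: str        # English term extracted from a translated description
    concept_id: str  # hypothetical identifier of a thesaurus concept
    gloss: str       # English gloss of that concept (made up for this demo)


def tokens(text: str) -> set[str]:
    """Lowercased words longer than three characters, punctuation stripped."""
    cleaned = (w.strip(".,;:()-").lower() for w in text.split())
    return {w for w in cleaned if len(w) > 3}


def score(context: str, gloss: str) -> float:
    """Jaccard overlap between context and gloss (stand-in for a neural scorer)."""
    c, g = tokens(context), tokens(gloss)
    if not c or not g:
        return 0.0
    return len(c & g) / len(c | g)


def prune(context: str, candidates: list[Candidate], threshold: float = 0.05) -> list[Candidate]:
    """Keep, for each term, the best-scoring sense if it reaches the threshold."""
    best: dict[str, tuple[float, Candidate]] = {}
    for cand in candidates:
        s = score(context, cand.gloss)
        if s < threshold:
            continue  # out-of-domain sense: drop it
        if cand.term not in best or s > best[cand.term][0]:
            best[cand.term] = (s, cand)
    return [cand for _, cand in best.values()]


context = ("floral decorative motifs (pedestal) - Neapolitan workshop "
           "(last quarter of the 19th century)")
candidates = [
    Candidate("pedestal", "demo:pedestal-support",
              "support or base for a statue, column or decorative object"),
    Candidate("pedestal", "demo:pedestal-esteem",
              "a position of high regard or esteem"),
]
kept = prune(context, candidates)
print([c.concept_id for c in kept])
```

In this toy setting the in-domain sense of "pedestal" shares a word with the context and survives, while the figurative sense has no overlap and is pruned. A neural scorer would replace `score` without changing the shape of `prune`.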

Notes

1. https://sites.google.com/uniroma1.it/agdli/.
2. ArCo “indirectly reuses: DOLCE-Zero, DOLCE+DnS, CIDOC-CRM, EDM, BIBFRAME, FRBR, FaBiO, FEntry, OAEntry” ARCO - PRIMER GUIDE V1.0 http://wit.istc.cnr.it/arco/primer-guide-v1.0-en.html.
3. Further worsened, as previously mentioned, by the automated translation into English of the concepts extracted from the Italian textual field in ArCo.
4. “The idea of an AI-complete problem has been around since at least the late 1970s, and refers to the more formal idea of the technique used to confirm the computational complexity of NP-complete problems.” [10].
5. MDZ Digital Library team (dbmdz) at the Bavarian State Library https://huggingface.co/dbmdz/bert-base-italian-xxl-cased.
6. https://dati.cultura.gov.it/sparql.
7. Google Translation AI, https://cloud.google.com/translate.
8. floral decorative motifs (pedestal) - Neapolitan workshop (last quarter of the 19th century).
9. https://www.signll.org/conll/.
10. https://spacy.io/.

References

Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: DBpedia: a nucleus for a web of open data. In: Aberer, K., et al. (eds.) ASWC/ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76298-0_52
de Bem Machado, A., Secinaro, S., Calandra, D., Lanzalonga, F.: Knowledge management and digital transformation for Industry 4.0: a structured literature review. Knowl. Manage. Res. Pract. 20(2), 320–338 (2022). https://doi.org/10.1080/14778238.2021.2015261
Bevilacqua, M., Pasini, T., Raganato, A., Navigli, R.: Recent trends in word sense disambiguation: a survey. In: Zhou, Z.H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Survey Track, pp. 4330–4338. International Joint Conferences on Artificial Intelligence Organization, August 2021. https://doi.org/10.24963/ijcai.2021/593
Binucci, C., De Luca, F., Di Giacomo, E., Liotta, G., Montecchiani, F.: Designing the content analyzer of a travel recommender system. Exp. Syst. Appl. 87, 199–208 (2017). https://doi.org/10.1016/j.eswa.2017.06.028
Carriero, V.A., et al.: ArCo: the Italian cultural heritage knowledge graph. In: Ghidini, C., et al. (eds.) ISWC 2019. LNCS, vol. 11779, pp. 36–52. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30796-7_3
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 4171–4186. Association for Computational Linguistics, June 2019. https://doi.org/10.18653/v1/N19-1423
Faralli, S., Lenzi, A., Velardi, P.: AGDLI: ArCo, GVP and DBpedia linking initiative. In: Seneviratne, O., Pesquita, C., Sequeda, J., Etcheverry, L. (eds.) Proceedings of the ISWC 2021 Posters, Demos and Industry Tracks: From Novel Ideas to Industrial Practice co-located with 20th International Semantic Web Conference, ISWC 2021, Virtual Conference, CEUR Workshop Proceedings, 24–28 October 2021, vol. 2980. CEUR-WS.org (2021). https://ceur-ws.org/Vol-2980/paper304.pdf
Feigenbaum, E.A.: Knowledge engineering. Ann. NY Acad. Sci. 426(1), 91–107 (1984). https://doi.org/10.1111/j.1749-6632.1984.tb16513.x
Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. Language, Speech, and Communication. MIT Press, Cambridge, MA (1998)
Goebel, R.: Folk reducibility and AI-complete problems. In: Dengel, A.R., Berns, K., Breuel, T.M., Bomarius, F., Roth-Berghofer, T.R. (eds.) KI 2008. LNCS (LNAI), vol. 5243, pp. 1–1. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85845-4_1
Hadiwinoto, C., Ng, H.T., Gan, W.C.: Improved word sense disambiguation using pre-trained contextualized word representations. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019, pp. 5296–5305. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1533
Harpring, P.: Development of the Getty vocabularies: AAT, TGN, ULAN, and CONA. Art Documentation J. Art Libr. Soc. North Am. 29(1), 67–72 (2010). https://www.jstor.org/stable/27949541
Hoppe, F., Dessì, D., Sack, H.: Deep learning meets knowledge graphs for scholarly data classification. In: Companion Proceedings of the Web Conference 2021, WWW 2021, pp. 417–421. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3442442.3451361
Huang, L., Sun, C., Qiu, X., Huang, X.: GlossBERT: BERT for word sense disambiguation with gloss knowledge. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019, pp. 3507–3512. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1355
Janev, V., Graux, D., Jabeen, H., Sallinger, E. (eds.): Knowledge Graphs and Big Data Processing. LNCS, vol. 12072. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-53199-7
Jiaju, D., Fanchao, Q., Maosong, S.: Using BERT for Word Sense Disambiguation, September 2019. https://doi.org/10.48550/arXiv.1909.08358
Luo, F., Liu, T., He, Z., Xia, Q., Sui, Z., Chang, B.: Leveraging gloss knowledge in neural word sense disambiguation by hierarchical co-attention. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October–November 2018, pp. 1402–1411. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/D18-1170
Mladineo, M., Crnjac Zizic, M., Aljinovic, A., Gjeldum, N.: Towards a knowledge-based cognitive system for industrial application: case of personalized products. J. Ind. Inf. Integr. 27, 100284 (2022). https://doi.org/10.1016/j.jii.2021.100284
Navigli, R.: Word sense disambiguation: a survey. ACM Comput. Surv. 41(2), 1–69 (2009). https://doi.org/10.1145/1459352.1459355
Pokojski, J., Szustakiewicz, K., Woźnicki, Ł., Oleksiński, K., Pruszyński, J.: Industrial application of knowledge-based engineering in commercial CAD/CAE systems. J. Ind. Inf. Integr. 25, 100255 (2022). https://doi.org/10.1016/j.jii.2021.100255
Rossi, A., Barbosa, D., Firmani, D., Matinata, A., Merialdo, P.: Knowledge graph embedding for link prediction: a comparative analysis. ACM Trans. Knowl. Discov. Data 15(2), 1–49 (2021). https://doi.org/10.1145/3424672
Shen, W., Wang, J., Han, J.: Entity linking with a knowledge base: issues, techniques, and solutions. IEEE Trans. Knowl. Data Eng. 27(2), 443–460 (2015). https://doi.org/10.1109/TKDE.2014.2327028
Tiddi, I., Schlobach, S.: Knowledge graphs as tools for explainable machine learning: A survey. Artif. Intell. 302, 103627 (2022). https://doi.org/10.1016/j.artint.2021.103627
Wiedemann, G., Remus, S., Chawla, A., Biemann, C.: Does BERT make any sense? Interpretable word sense disambiguation with contextualized embeddings. In: Proceedings of the 15th Conference on Natural Language Processing, KONVENS 2019, Erlangen, Germany, 9–11 October 2019 (2019). https://corpora.linguistik.uni-erlangen.de/data/konvens/proceedings/papers/KONVENS2019_paper_43.pdf
Yap, B.P., Koh, A., Chng, E.S.: Adapting BERT for word sense disambiguation with gloss selection objective and example sentences. In: Findings of the Association for Computational Linguistics, EMNLP 2020, Online, pp. 41–46. Association for Computational Linguistics, November 2020. https://www.aclweb.org/anthology/2020.findings-emnlp.4
Zheng, C., et al.: Knowledge-based program generation approach for robotic manufacturing systems. Robot. Comput. Integr. Manuf. 73, 102242 (2022). https://doi.org/10.1016/j.rcim.2021.102242
Zheng, W., Cheng, J., Wu, X., Sun, R., Wang, X., Sun, X.: Domain knowledge-based security bug reports prediction. Knowl. Based Syst. 241, 108293 (2022). https://doi.org/10.1016/j.knosys.2022.108293
Zhuang, Y., Wu, F., Chen, C., Pan, Y.: Challenges and opportunities: from big data to knowledge in AI 2.0. Front. Inf. Technol. Electron. Eng. 18(1), 3–14 (2017). https://doi.org/10.1631/FITEE.1601883

Acknowledgements

This work was carried out within the research project “SMARTOUR: intelligent platform for tourism”, funded by the Italian Ministry of University and Research with the Regional Development Fund of the European Union (PON Research and Competitiveness 2007-2013).

Author information

Authors and Affiliations

Sapienza University of Rome, Rome, Italy
Erica Faggiani, Stefano Faralli & Paola Velardi

Authors

Erica Faggiani
Stefano Faralli
Paola Velardi

Corresponding author

Correspondence to Stefano Faralli.

Editor information

Editors and Affiliations

Politecnico di Torino, Turin, Italy
Silvia Chiusano
Politecnico di Torino, Turin, Italy
Tania Cerquitelli
Poznań University of Technology, Poznań, Poland
Robert Wrembel
Norwegian University of Science and Technology, Trondheim, Norway
Kjetil Nørvåg
University of Genoa, Genoa, Italy
Barbara Catania
CNRS, Villeurbanne Cedex, France
Genoveva Vargas-Solar
University of Calabria, Rende, Italy
Ester Zumpano

About this paper

Cite this paper

Faggiani, E., Faralli, S., Velardi, P. (2022). Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_54

DOI: https://doi.org/10.1007/978-3-031-15743-1_54
Published: 29 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15742-4
Online ISBN: 978-3-031-15743-1
eBook Packages: Computer Science, Computer Science (R0)
