Abstract
Europeana, a digital library that aggregates content from libraries, archives and museums from all around Europe, offers search functionality using the metadata of more than 62 million objects. However, in most cases, this data is only available in one language, while users come from countries with different languages. Europeana’s strategy for the improvement of multilingual experiences includes the design and implementation of a multilingual information retrieval system based on the translation of queries and metadata to English. As a first development in this context, we have implemented a pilot applying query translation to English for the Spanish version of the website in order to surface results that have English metadata associated with them. We conducted an evaluation to assess the performance of this pilot and identify issues. The good performance rates observed allowed us to take the pilot to production, and the issues identified led to a list of specific actions, which should be addressed to the extent possible before the application of a wider multilingual information retrieval system.
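As a rough illustration of the query-translation idea described in the abstract, the following minimal sketch combines a Spanish query with its English translation so that records with English metadata can also match. The abstract does not say how the pilot combines the two queries. The `OR` combination, the function `build_multilingual_query`, and the toy lookup table standing in for a machine translation service are all assumptions made for this example.

```python
from typing import Callable


def build_multilingual_query(query: str, translate: Callable[[str], str]) -> str:
    """Combine the user's query with its English translation (assumed OR strategy)."""
    original = query.strip()
    translated = translate(original).strip()
    # If translation leaves the query unchanged (e.g. a proper noun), search once.
    if not translated or translated.casefold() == original.casefold():
        return f"({original})"
    return f"({original}) OR ({translated})"


# Hypothetical stand-in for a real machine translation service.
TOY_DICTIONARY = {"mapas antiguos": "old maps"}


def toy_translate(text: str) -> str:
    """Look the query up in the toy table; return it unchanged if unknown."""
    return TOY_DICTIONARY.get(text.casefold(), text)


print(build_multilingual_query("mapas antiguos", toy_translate))
print(build_multilingual_query("Picasso", toy_translate))
```

Passing the translator in as a function keeps the sketch independent of any particular translation service, so a real client could replace `toy_translate` without changing the query-building logic.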
Notes
- 1.
- 2.
- 3.
- 4. This reflects a quality issue in the metadata received by Europeana, which is often not labelled with language information. We know that the language of the data provider is Spanish for 5.5% of the records, while it is English for 17.5%.
- 5. The period was chosen to reduce the automatic sampling of sessions applied by Google Analytics, from which we collected the queries.
- 6.
- 7.
- 8. We used trec_eval v9.0, accessible at https://github.com/usnistgov/trec_eval.
- 9.
- 10.
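Note 8 mentions trec_eval, which scores ranked result lists against relevance judgments. The notes do not say which measures were reported, so the sketch below only illustrates two measures of the kind trec_eval computes: average precision (averaged over queries, it is reported as `map`) and precision at a cutoff (`P_k`). The function names and the toy ranking are assumptions made for this example, and the code is not a replacement for the tool itself.

```python
from typing import Iterable, Sequence


def average_precision(ranked_ids: Sequence[str], relevant_ids: Iterable[str]) -> float:
    """Mean of the precision values at the rank of each relevant document retrieved,
    divided by the total number of relevant documents for the query."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of the top-k positions occupied by relevant documents."""
    relevant = set(relevant_ids)
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant) / k


# Toy ranking for a single query (hypothetical document identifiers).
ranking = ["d1", "d2", "d3", "d4"]
judged_relevant = {"d1", "d3"}

print(average_precision(ranking, judged_relevant))  # (1/1 + 2/3) / 2
print(precision_at_k(ranking, judged_relevant, 2))  # 1 relevant in the top 2
```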
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Marrero, M., Isaac, A. (2022). Implementation and Evaluation of a Multilingual Search Pilot in the Europeana Digital Library. In: Silvello, G., et al. Linking Theory and Practice of Digital Libraries. TPDL 2022. Lecture Notes in Computer Science, vol 13541. Springer, Cham. https://doi.org/10.1007/978-3-031-16802-4_8
DOI: https://doi.org/10.1007/978-3-031-16802-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16801-7
Online ISBN: 978-3-031-16802-4
eBook Packages: Computer Science, Computer Science (R0)