Implementation and Evaluation of a Multilingual Search Pilot in the Europeana Digital Library

  • Conference paper
Linking Theory and Practice of Digital Libraries (TPDL 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13541))

Abstract

Europeana, a digital library that aggregates content from libraries, archives and museums from all around Europe, offers search functionality using the metadata of more than 62 million objects. However, in most cases, this data is only available in one language, while users come from countries with different languages. Europeana’s strategy for the improvement of multilingual experiences includes the design and implementation of a multilingual information retrieval system based on the translation of queries and metadata to English. As a first development in this context, we have implemented a pilot applying query translation to English for the Spanish version of the website in order to surface results that have English metadata associated with them. We conducted an evaluation to assess the performance of this pilot and identify issues. The good performance rates observed allowed us to take the pilot to production, and the issues identified led to a list of specific actions, which should be addressed to the extent possible before the application of a wider multilingual information retrieval system.
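
The query translation step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the translated query is combined with the original using a boolean OR; the abstract does not state how the pilot combines them. The names `build_pilot_query`, `toy_translate` and `TOY_DICTIONARY` are hypothetical, and the dictionary lookup merely stands in for a real machine translation service.

```python
from typing import Callable

def build_pilot_query(user_query: str,
                      translate_to_english: Callable[[str], str]) -> str:
    """Combine a query with its English translation.

    Assumption: the OR combination is illustrative only and is not
    confirmed as the scheme used in the pilot.
    """
    original = user_query.strip()
    translated = translate_to_english(original).strip()
    # If translation changes nothing (e.g. a proper noun), keep the query as is.
    if not translated or translated.casefold() == original.casefold():
        return original
    return f"({original}) OR ({translated})"

# Hypothetical stand-in for a machine translation service.
TOY_DICTIONARY = {"mapas antiguos": "old maps"}

def toy_translate(text: str) -> str:
    # Return the dictionary translation, or the text unchanged if unknown.
    return TOY_DICTIONARY.get(text.casefold(), text)

print(build_pilot_query("mapas antiguos", toy_translate))
```

Keeping the original query alongside the translation means that records with Spanish metadata remain retrievable, while records with English metadata become reachable as well.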

Notes

  1. https://europeana.eu/es.

  2. http://en.childrenslibrary.org.

  3. https://pro.europeana.eu/page/edm-documentation.

  4. This reflects a quality issue in the metadata received by Europeana, which is often not labelled with language information. We know that the language of the data provider is Spanish for 5.5% of the records, while it is English for 17.5%.

  5. The period considered seeks to reduce the automatic sampling of sessions done by Google Analytics, from which we have collected the queries.

  6. https://solr.apache.org/guide/7_7/the-query-elevation-component.html.

  7. https://github.com/europeana/search/tree/12a2d78/solr_confs/metadata.

  8. We used trec_eval v9.0, accessible at https://github.com/usnistgov/trec_eval.

  9. https://pro.europeana.eu/page/entity#entity-collection.

  10. https://pro.europeana.eu/project/europeana-translate.

References

  1. Agosti, M., Fabris, E., Silvello, G.: On synergies between information retrieval and digital libraries. In: Manghi, P., Candela, L., Silvello, G. (eds.) IRCDL 2019. CCIS, vol. 988, pp. 3–17. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11226-4_1

  2. CEF Automated Translation Service: eTranslation. https://ec.europa.eu/cefdigital/wiki/display/CEFDIGITAL/eTranslation

  3. Chen, H.: Digital library research in the US: an overview with a knowledge management perspective. Program: Electron. Lib. Inf. Syst. 38(3), 157–167 (2004)

  4. Diekema, A.R.: Multilinguality in the digital library: a review. Electron. Libr. 30(2), 165–181 (2012)

  5. Dolamic, L., Savoy, J.: Retrieval effectiveness of machine translated queries. J. Am. Soc. Inf. Sci. Technol. 61, 2266–2273 (2010)

  6. España-Bonet, C., Stiller, J., Ramthun, R., van Genabith, J., Petras, V.: Query translation for cross-lingual search in the academic search engine PubPsych. In: Garoufallou, E., Sartori, F., Siatri, R., Zervas, M. (eds.) MTSR 2018. CCIS, vol. 846, pp. 37–49. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-14401-2_4

  7. Google Cloud Translation API. https://cloud.google.com/translate, Java client library (com.google.cloud:libraries-bom:4.0.0)

  8. Kools, J., Lagos, N., Petras, V., Stiller, J., Vald, E.: GALATEAS project (Generalized Analysis of Logs for Automatic Translation and Episodic Analysis of Searches). D7.4 Final Evaluation of Query Translation (2013). version 2.0

  9. Marrero, M., Isaac, A.: Implementation and evaluation of a multilingual search pilot in the Europeana digital library (dataset) (2022). https://doi.org/10.5281/zenodo.6861293

  10. Marrero, M., Isaac, A., Freire, N.: Automatic translation and multilingual cultural heritage retrieval: a case study with transcriptions in Europeana. In: Berget, G., Hall, M.M., Brenn, D., Kumpulainen, S. (eds.) TPDL 2021. LNCS, vol. 12866, pp. 133–138. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86324-1_17

  11. Marrero, M., Isaac, A., Manguinas, H., Neale, A.: Europeana DSI-4 search improvement strategy. Technical report, Europeana (2021). https://pro.europeana.eu/post/europeana-search-strategy

  12. Matusiak, K.K., Meng, L., Barczyk, E., Shih, C.J.: Multilingual metadata for cultural heritage materials: the case of the Tse-Tsung Chow collection of Chinese scrolls and fan paintings. Electron. Libr. 33(1), 136–151 (2015)

  13. Neale, A., Isaac, A., Manguinas, H., Moskalenko, D., Marrero, M.: Europeana DSI-4 multilingual strategy. Technical report, Europeana (2020). https://pro.europeana.eu/post/europeana-dsi-4-multilingual-strategy

  14. Oudenaren, J.V.: The world digital library. Uncommon Culture 3(5/6), 65–71 (2012)

  15. Peters, C., Braschler, M., Clough, P.: Multilingual Information Retrieval: From Research to Practice. Springer, Heidelberg, Germany (2012). https://doi.org/10.1007/978-3-642-23008-0

  16. Savoy, J., Braschler, M.: Lessons learnt from experiments on the Ad Hoc multilingual test collections at CLEF. In: Information Retrieval Evaluation in a Changing World. TIRS, vol. 41, pp. 177–200. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-22948-1_7

  17. Stiller, J., Gäde, M., Petras, V.: Ambiguity of queries and the challenges for query language detection. In: CLEF 2010 LABs and Workshops, Padua, Italy (2010)

  18. Stiller, J., Gäde, M., Petras, V.: Multilingual access to digital libraries: the Europeana use case. Inf. Wiss. Prax. 64(2–3), 86–95 (2013)

  19. Stiller, J., Petras, V.: Best practices for multilingual access. Technical report, Europeana (2016). https://pro.europeana.eu/post/best-practices-for-multilingual-access

  20. Stiller, J., Petras, V., Lüschow, A.: CLUBS Project (Cross-Lingual Bibliographic Search). M5.3 Final Evaluation (2019). version 1.0

  21. Vassilakaki, E., Garoufallou, E.: Multilingual digital libraries: a review of issues in system-centered and user-centered studies, information retrieval and user behavior. Int. Inf. Lib. Rev. 45, 3–19 (2013)

Corresponding authors

Correspondence to Mónica Marrero or Antoine Isaac.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Marrero, M., Isaac, A. (2022). Implementation and Evaluation of a Multilingual Search Pilot in the Europeana Digital Library. In: Silvello, G., et al. Linking Theory and Practice of Digital Libraries. TPDL 2022. Lecture Notes in Computer Science, vol 13541. Springer, Cham. https://doi.org/10.1007/978-3-031-16802-4_8

  • DOI: https://doi.org/10.1007/978-3-031-16802-4_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16801-7

  • Online ISBN: 978-3-031-16802-4

  • eBook Packages: Computer Science, Computer Science (R0)
