Opportunities and challenges in enhancing access to metadata of cultural heritage collections: a survey

Artificial Intelligence Review

Abstract

Machine-processable data that describe digital or non-digital resources are termed metadata. Different metadata standards exist for describing various types of digital objects. Several studies have reported on how to address issues related to accessing metadata resources. Most studies on metadata involve the cultural heritage domain, which indicates the importance of this domain in metadata research and development. Research on metadata in cultural heritage mainly revolves around three fundamental issues: (1) the lack of quality in metadata contents in most cases, (2) difficulty in accessing metadata contents, due largely to users' limited knowledge of the content of the metadata, and (3) heterogeneity of the data at the level of schemas, which makes access even more difficult. The lack of quality in metadata makes it difficult for users to retrieve and explore information that satisfies their needs. To make metadata contents more accessible, the metadata content must therefore be enhanced, especially for cultural heritage collections, which consist of digital objects (structured documents) described by a variety of metadata schemas. This paper presents issues and challenges in enhancing access to metadata by reviewing the existing approaches in the metadata environment, with a particular emphasis on cultural heritage collections. We first look at the classification of metadata, which is divided into two categories, namely data retrieval and information retrieval. We then present the analysis, findings and suggestions on how to address issues in enhancing access to metadata contents, especially in cultural heritage collections. A detailed comparison between information retrieval and data retrieval is given, focusing on the applicability of one approach over the other. A framework that aims to improve the effectiveness of retrieval when searching metadata is also proposed and tested. The proposed framework consists of approaches and methods that are expected to enhance access to metadata, especially in cultural heritage collections, and to be useful for those with limited knowledge of cultural heritage. The experiments were conducted on CHiC2013, a cultural heritage collection. The results show a considerable enhancement over other IR approaches that use expansion methods.
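To make the contrast between data retrieval and information retrieval concrete, the following is a minimal sketch and not the framework proposed in the paper. The records, field names and scoring rule are all assumptions invented for illustration: data retrieval is modelled as an exact match on one metadata field, and information retrieval as a ranking by raw query-term counts over all text fields.

```python
import re
from collections import Counter

# Hypothetical metadata records (invented values, loosely Dublin Core-like fields).
RECORDS = [
    {"id": "r1", "title": "Bronze helmet", "subject": "armour",
     "description": "Roman cavalry helmet made of bronze"},
    {"id": "r2", "title": "Oil painting of a harbour", "subject": "painting",
     "description": "Ships at anchor in a harbour at dusk"},
    {"id": "r3", "title": "Iron sword", "subject": "weapon",
     "description": "Roman short sword with bronze fittings"},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def data_retrieval(records, field, value):
    """Exact match on one field: a record either satisfies the condition or not."""
    return [r["id"] for r in records if r.get(field) == value]


def information_retrieval(records, query):
    """Rank records by how often the query terms occur in any text field."""
    terms = tokenize(query)
    scored = []
    for r in records:
        text = " ".join(str(v) for k, v in r.items() if k != "id")
        counts = Counter(tokenize(text))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((score, r["id"]))
    # Highest score first; ties broken by record id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rid for _, rid in scored]


print(data_retrieval(RECORDS, "subject", "weapon"))
print(information_retrieval(RECORDS, "roman bronze"))
```

In this sketch the exact-match lookup only succeeds when the user already knows the field and its value, whereas the ranked search tolerates a free-text query. That mirrors the abstract's point about users with limited knowledge of the metadata content.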

Author information

Corresponding author

Correspondence to Wafa’ Za’al Alma’aitah.

About this article

Cite this article

Alma’aitah, W.Z., Talib, A.Z. & Osman, M.A. Opportunities and challenges in enhancing access to metadata of cultural heritage collections: a survey. Artif Intell Rev 53, 3621–3646 (2020). https://doi.org/10.1007/s10462-019-09773-w

  • DOI: https://doi.org/10.1007/s10462-019-09773-w