Ranking Dublin Core descriptor lists from user interactions: a case study with Dublin Core Terms using the Dendro platform

International Journal on Digital Libraries

Abstract

Dublin Core descriptors capture metadata in most repositories, including recent repositories dedicated to datasets. DC descriptors are generic and are being adapted to the requirements of different communities through so-called Dublin Core Application Profiles, which rely on agreement within user communities and take their evolving needs into account. In this paper, we propose an automated process to help curators and users discover the descriptors that best suit the needs of a specific research group when describing and depositing datasets. Our approach is implemented on Dendro, a prototype research data management platform, where an experimental method ranks DC Terms descriptors and presents them to users based on their usage patterns. User interaction is recorded and used to score descriptors. In a controlled experiment, we gathered the interactions of two groups as they used Dendro to describe datasets from selected sources. One group viewed descriptors ordered according to the ranking, while the other saw the same list of descriptors throughout the experiment. Preliminary results show that (1) some DC Terms are filled in more often than others, with different distributions in the two groups, (2) descriptors in higher ranks were increasingly accepted by users at the expense of manual selection, (3) users were satisfied with the performance of the platform, and (4) the quality of description was not hindered by descriptor ranking.
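
As a rough illustration of scoring descriptors from recorded interactions, the sketch below sums a weight per interaction and sorts descriptors by total score. It is not the scoring method used in Dendro: the event types (`accepted_from_list`, `selected_manually`, `filled_in`), their weights, and the names `Interaction` and `rank_descriptors` are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Interaction(NamedTuple):
    """One recorded user action on a descriptor (hypothetical structure)."""
    descriptor: str  # e.g. "dcterms:title"
    kind: str        # hypothetical event type, see WEIGHTS


# Hypothetical weights: the abstract does not say how interactions are scored.
WEIGHTS: Dict[str, float] = {
    "accepted_from_list": 1.0,
    "selected_manually": 2.0,
    "filled_in": 3.0,
}


def rank_descriptors(events: Iterable[Interaction]) -> List[Tuple[str, float]]:
    """Return (descriptor, score) pairs ordered from highest to lowest score."""
    scores: Dict[str, float] = defaultdict(float)
    for event in events:
        # Unknown event types contribute nothing to the score.
        scores[event.descriptor] += WEIGHTS.get(event.kind, 0.0)
    # Sort by descending score; break ties alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


events = [
    Interaction("dcterms:title", "filled_in"),
    Interaction("dcterms:creator", "accepted_from_list"),
    Interaction("dcterms:title", "accepted_from_list"),
    Interaction("dcterms:spatial", "selected_manually"),
]
ranking = rank_descriptors(events)
```

With these assumed weights, `dcterms:title` ranks first because it accumulates two interactions; a real system would also need to decide how scores decay over time and how to seed the list before any usage exists.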

Notes

  1. Example from http://wiki.dublincore.org/index.php/FAQ/DC_and_DCTERMS_Namespaces.

  2. Web site: http://dendro.fe.up.pt/blog/index.php/dendro. Source code: http://github.com/feup-infolab/dendro. Demo instance: http://dendro.fe.up.pt/demo.

  3. http://bloody-byte.net/rdf/dc_owl2dl/dcterms.

  4. https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-mlt-query.html.

  5. https://www.icpsr.umich.edu/icpsrweb/.

  6. https://b2share.eudat.eu/.

  7. http://www.re3data.org/.

  8. https://assessment.datasealofapproval.org/assessment_114/seal/html/.

  9. We have used the tokenize_words method of the tokenizers R package, available at https://cran.r-project.org/web/packages/tokenizers/index.html.

  10. http://dendro.fe.up.pt/demo.

  11. https://github.com/feup-infolab/dendro.

  12. https://www.inesctec.pt/en/projects/tail-PC05170.

  13. https://cibio.up.pt.

  14. https://dendro.inesctec.pt and https://dendro-rdm.up.pt.

  15. https://rdm.inesctec.pt and https://ckan-rdm.up.pt.

  16. https://ckan.org.


Acknowledgements

This work is financed by the European Regional Development Fund (ERDF) through the Operational Programme for Competitiveness and Internationalisation (COMPETE 2020 Programme) and by National Funds through the Portuguese funding agency, Fundação para a Ciência e a Tecnologia (FCT), within project POCI-01-0145-FEDER-016736.

Author information

Correspondence to João Rocha da Silva.

About this article

Cite this article

da Silva, J.R., Ribeiro, C. & Lopes, J.C. Ranking Dublin Core descriptor lists from user interactions: a case study with Dublin Core Terms using the Dendro platform. Int J Digit Libr 20, 185–204 (2019). https://doi.org/10.1007/s00799-018-0238-x

  • DOI: https://doi.org/10.1007/s00799-018-0238-x
