Published by De Gruyter Oldenbourg, August 1, 2017

Opinion paper: Data provenance challenges in biomedical research

Benjamin Baum, Christian R. Bauer, Thomas Franke, Harald Kusch, Marcel Parciak, Thorsten Rottmann, Nadine Umbach and Ulrich Sax

All authors: Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany

Ulrich Sax: Head of WG Infrastructure for Translational Research, Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany

Abstract

In this opinion paper, we provide an overview of some challenges concerning data provenance in biomedical research. We review the current literature and describe examples of implicit or explicit provenance aspects in some standard data types in translational research. Furthermore, we assess the need for further standardization of data provenance in biomedical informatics. Basic data provenance should record the origin of the data and the transformation steps applied to it, and should support replication and presentation of the data. Even though usable concepts for documenting data provenance have been available in other fields since as early as 2005, their uptake in biomedical projects and in the biomedical literature is quite low. Awareness of the necessity of basic data provenance has to be raised, and the education of data managers has to be further improved.

Acknowledgement

Some aspects of this article were presented at the HEC 2016 conference and are cited accordingly [1].

This work was supported by the German Federal Ministry of Education and Research (BMBF) in the context of the following projects: DZHK 81Z7300173 and 81X1300101, GenoPerspektiv 01GP1402, sysINFLAME 01ZX1306C, myPathSem 031L0024A and IDRT1/2 by the TMF/BMBF grant 01GI1003 as well as the German Research Foundation (DFG) within the grant for the Collaborative Research Centre (SFB) 1002, subproject INF.

Received: 2017-5-18
Revised: 2017-6-16
Accepted: 2017-7-21
Published Online: 2017-8-1
Published in Print: 2017-8-28

©2017 Walter de Gruyter Berlin/Boston

Downloaded on 26.4.2024 from https://www.degruyter.com/document/doi/10.1515/itit-2016-0031/html