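Daten wie Sand am Meer – Datenerhebung, -strukturierung, -management und Data Provenance für die Ostseeforschung [Data Like Sand on the Seashore: Data Collection, Structuring, Management, and Data Provenance for Baltic Sea Research]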

Bruder, Ilvio; Klettke, Meike; Möller, Mark Lukas; Meyer, Frank; Heuer, Andreas; Jürgensmann, Susanne; Feistel, Susanne

doi:10.1007/s13222-017-0259-4

Technical article
Published: 19 June 2017

Datenbank-Spektrum, volume 17, pages 183–196 (2017)

Ilvio Bruder¹,
Meike Klettke¹,
Mark Lukas Möller¹,
Frank Meyer¹,
Andreas Heuer¹,
Susanne Jürgensmann² &
Susanne Feistel²

Abstract

Data management for heterogeneous environmental data is demonstrated using several projects from the maritime domain as examples. Particular emphasis is placed on a pipeline for integrating heterogeneous research data, on the traceability of the data (data provenance), and on the treatment of temporal aspects in collecting, storing, and analyzing the data.

Notes

Pangea.de is an initiative of the Alfred-Wegener-Institut Helmholtz-Zentrum für Polar- und Meeresforschung and the Universität Bremen for a system to publish and index georeferenced research data.
Benthos: organisms living in and on the bottom of bodies of water [40]
Taxon, pl. taxa: term for a group of organisms defined in the biological sense.
“Abra prismatica”, synonym “Abra fragilis”, is a bivalve species, known in German as Lange Pfeffermuschel.
“Semelidae”, synonym “Scrobiculariidae”, is the bivalve family known in German as Pfeffermuscheln.

References

1. Bose R, Frew J (2005) Lineage retrieval for scientific data processing: a survey. ACM Comput Surv 37(1):1–28
2. Buneman P, Chapman A, Cheney J (2006) Provenance management in curated databases. In: Proceedings of the 2006 ACM SIGMOD international conference on management of data - SIGMOD ’06. ACM, New York, pp 539–550
3. Büttner S, Hobohm HC, Müller L (2011) Handbuch Forschungsdatenmanagement. BOCK+HERCHEN, Bad Honnef
4. Celko J (2012) Joe Celko’s trees and hierarchies in SQL for smarties. Morgan Kaufmann, Elsevier, Burlington
5. Cheney J, Chiticariu L, Tan WC (2009) Provenance in databases: why, how, and where. Found Trends Databases 1(4):379–474
6. Curino CA, Moon HJ, Zaniolo C (2008) Graceful database schema evolution: the prism workbench. Proceedings VLDB Endowment 1(1):761–772
7. Dalamagas T, Cheng T, Winkel KJ, Sellis T (2006) A methodology for clustering XML documents by structure. Inf Syst 31:187–228
8. Doan A, Halevy AY, Ives ZG (2012) Principles of data integration. Morgan Kaufmann, Burlington
9. Fagin R, Kolaitis PG, Popa L, Tan WC (2011) Schema mapping evolution through composition and inversion. In: Schema matching and mapping. Springer, Heidelberg
10. Glavic B, Alonso G (2009) The PERM provenance management system in action. In: Proceedings of the 35th SIGMOD international conference on Management of data - SIGMOD ’09. ACM, New York
11. Hartung M, Terwilliger JF, Rahm E (2011) Recent advances in schema and ontology evolution. In: Schema matching and mapping. Springer, Heidelberg, pp 149–190
12. Hegewald J, Naumann F, Weis M (2006) XStruct: efficient schema extraction from multiple and large XML documents. In: ICDE Workshops, pp 81
13. Heuer A (1989) Equivalent schemes in semantic, nested relational, and relational database models. In: Proceedings MFDBS’89, pp 237–253
14. Heuer A (2015) METIS in PArADISE: Provenance Management bei der Auswertung von Sensordatenmengen für die Entwicklung von Assistenzsystemen. In: BTW workshops, pp 131–136
15. ISO/IEC 9075-2:2011 (2011) Information technology - Database languages - SQL-Part 2: Foundation (SQL/Foundation). Tech. rep., ISO/IEC JTC 1/SC 32
16. Kirsten T, Gross A, Hartung M, Rahm E (2011) GOMMA: a component-based infrastructure for managing and analyzing life science ontologies and their evolution. J Biomed Semant 2(1):6
17. Klettke M (2007) Modellierung, Bewertung und Evolution von XML-Dokumentkollektionen. Habilitation thesis. Logos, Berlin
18. Klettke M, Meyer H (2000) XML and object-relational database systems - enhancing structural mappings based on statistics. In: Proceedings WebDB, pp 151–170
19. Klettke M, Scherzinger S, Störl U (2014) Datenbanken ohne Schema? Herausforderungen und Lösungs-Strategien in der agilen Anwendungsentwicklung mit schema-flexiblen NoSQL-Datenbanksystemen. Datenbank Spektrum 14(2):119–129
20. Klettke M, Scherzinger S, Störl U (2015) Schema extraction and structural outlier detection for JSON-based NoSQL data stores. In: Proceedings BTW’15, pp 425–444
21. Kulkarni K, Michels JE (2012) Temporal features in SQL:2011. ACM SIGMOD Rec 41(3):34–43
22. Köppen V, Saake G, Sattler KU (2012) Data Warehouse Technologien. mitp, Frechen
23. Leser U, Naumann F (2006) Informationsintegration. dpunkt.verlag, Heidelberg
24. Luksetich DL (2012) How to leverage DB2’s automated time travel queries and temporal tables. Enterprise Systems Media, Richardson
25. McPhilips T, Bowers S, Ludäscher B (2006) Collection-oriented scientific workflows for integrating and analyzing biological data. In: Proceedings of the DILS Workshop
26. Meyer F (2015) Aufbau einer Artenlistenverwaltung im Benthos-Projekt. Bachelor's thesis, Universität Rostock
27. Meyer F (2016) Temporale Aspekte und Provenance-Anfragen im Umfeld des Forschungsdatenmanagements. Master's thesis, Universität Rostock
28. Miller RJ (2007) Retrospective on Clio: schema mapping and data exchange in practice. In: Proceedings Ws. DL’07
29. Miller RJ, Hernández MA, Haas LM, Yan L, Ho CTH, Fagin R, Popa L (2001) The Clio project: managing heterogeneity. ACM SIGMOD Rec 30(1):78–83
30. Moh CH, Lim EP, Ng WK (2000) DTD-miner, a tool for mining DTD from XML documents. In: Proceedings WECWIS
31. Moreau L, Groth PT (2013) Provenance: an introduction to PROV. Morgan & Claypool, San Rafael
32. Motro A (1994) Intensional answers to database queries. IEEE Trans Knowl Data Eng 6(3):444–454
33. Möller ML (2016) Aufbau einer Forschungsdatenverwaltung für chemische und physikalische In-Situ-Daten aus der Ostsee. Bachelor's thesis, Universität Rostock
34. Naumann F, Leser U, Freytag JC (1999) Quality-driven integration of heterogenous information systems. In: Proceedings VLDB’99, pp 447–458
35. Necaský M, Klímek J, Malý J, Mlýnková I (2012) Evolution and change management of XML-based systems. J Syst Softw 85(3):683–707
36. Prien RD, Schulz-Bull DE (2016) Technical note: GODESS – a profiling mooring in the Gotland Basin. Ocean Sci Discuss. doi:10.5194/os-2016-11
37. Rahm E, Do HH (2000) Data cleaning: problems and current approaches. IEEE Data Eng Bull 23(4):3–13
38. Rahm E, Kirsten T, Lange J (2007) The GeWare data warehouse platform for the analysis of molecular-biological and clinical data. J Integr Bioinform 4(1):47
39. Redman TC (1996) Data quality for the information age. Artech House, London
40. Rheinheimer G (1996) Meereskunde der Ostsee. Springer, Heidelberg
41. Saake G, Sattler K, Heuer A (2013) Datenbanken - Konzepte und Sprachen, 5th edn. mitp, Frechen
42. Saracco C, Nicola M, Gandhi L (2012) A matter of time: temporal data management in DB2 10 (IBM Developer Works)
43. Schick S, Meyer H, Heuer A (2013) Flexy: Flexible, datengetriebene prozessmodelle mit YAWL. In: Proceedings der BTW’13
44. Schönbach C, Kowalski-Saunders P, Brusic V (2000) Data warehousing in molecular biology. Brief Bioinformatics 1(1):190–198
45. Snodgrass RT (ed) (1995) The TSQL2 temporal query language. Kluwer, Dordrecht
46. Snodgrass RT (1999) Developing time-oriented database applications in SQL. Morgan Kaufmann, Burlington
47. Svacina J (2016) Intensional Answers for Provenance Queries in Big Data Analytics. Bachelor's thesis, Universität Rostock
48. Zierke J (2014) Konzeption der Datenintegration für eine zu entwickelnde Benthos-Datenbank. Master's thesis, Universität Rostock

Acknowledgments

We thank the staff of the IOW working group Chemical In Situ Sensors led by Ralf Prien and the staff of the IOW working group Ecology of Benthic Organisms led by Michael Zettler for their expert support and for providing the data. We also thank the student project groups and student assistants of the 2015 and 2016 cohorts for their support in implementing individual components of the framework, the provenance techniques (project METIS), and the privacy techniques (project PArADISE): in the research data management framework cluster, Dennis Weu, Paul Wegener, Oleg Wagenleitner, and Hannes Awolin; in the provenance cluster, Jan Svacina, Pia Wilsdorf, Tanja Auge, Sabrina Brossmann, Marc Stefan Martens, and Daniel Horak; in the privacy cluster, Jan Tepke, Hannes Steffenhagen, Christoph Damerius, Martin Haufschild, and Felix Thomas Wächter; and, across all clusters, our student assistants Richard Dabels, Johann Kluth, and Jörg Stüwe.

Author information

Authors and Affiliations

Universität Rostock, Rostock, Germany
Ilvio Bruder, Meike Klettke, Mark Lukas Möller, Frank Meyer & Andreas Heuer
Leibniz-Institut für Ostseeforschung Warnemünde, Warnemünde, Germany
Susanne Jürgensmann & Susanne Feistel

Corresponding author

Correspondence to Ilvio Bruder.

About this article

Cite this article

Bruder, I., Klettke, M., Möller, M.L. et al. Daten wie Sand am Meer – Datenerhebung, -strukturierung, -management und Data Provenance für die Ostseeforschung. Datenbank Spektrum 17, 183–196 (2017). https://doi.org/10.1007/s13222-017-0259-4

Received: 21 March 2017
Accepted: 01 June 2017
Published: 19 June 2017
Issue Date: July 2017
DOI: https://doi.org/10.1007/s13222-017-0259-4
