GridDB: A Data-Centric Overlay for Scientific Grids

https://doi.org/10.1016/B978-012088469-8.50054-1

Publisher Summary

This chapter describes the design, implementation, and evaluation of GridDB, a data-centric overlay for scientific grid data analysis. In contrast to currently deployed process-centric middleware, GridDB manages data entities rather than processes. GridDB provides a suite of services important to data analysis: a declarative interface, type-checking, interactive query processing, memoization, data provenance, and co-existence. GridDB is based on two core principles. First, scientific analysis programs can be abstracted as typed functions, and program invocations as typed function calls. Second, while most scientific analysis data is not relational in nature, a key subset, including the inputs and outputs of scientific workflows, has relational characteristics. This data can be manipulated with SQL and can serve as an interface to the full data set. The chapter discusses several elements of GridDB, including its workflow/data model, query language, software architecture, query processing, and a prototype implementation. GridDB is validated by modeling real-world physics and astronomy analyses and by measurements on the prototype.
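
The two principles in the summary can be illustrated with a small sketch. It is not GridDB's interface: the helper `make_memoized`, the stand-in program `simulate`, and the table name `simulate_calls` are all hypothetical, and SQLite stands in for whatever relational store a real deployment would use. The sketch treats an analysis program as a typed function, records each invocation's input and output as a relational row, reuses stored rows instead of rerunning the program (memoization), and then queries the recorded calls with SQL.

```python
import sqlite3
from typing import Callable, List


def make_memoized(conn: sqlite3.Connection, table: str,
                  fn: Callable[[float], float]) -> Callable[[float], float]:
    """Wrap a float -> float program so every call is stored as a row.

    Hypothetical helper: `table` is assumed to be a trusted identifier.
    """
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(input REAL PRIMARY KEY, output REAL)"
    )

    def call(x: float) -> float:
        # Crude runtime type check standing in for a typed function signature.
        if not isinstance(x, (int, float)):
            raise TypeError(f"expected a number, got {type(x).__name__}")
        # Memoization: return the stored output if this input was seen before.
        row = conn.execute(
            f"SELECT output FROM {table} WHERE input = ?", (x,)
        ).fetchone()
        if row is not None:
            return row[0]
        y = fn(x)
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (x, y))
        return y

    return call


# Records each time the underlying program actually runs.
calls: List[float] = []


def simulate(energy: float) -> float:
    """Stand-in for an expensive analysis program (assumption)."""
    calls.append(energy)
    return energy * 2.0


conn = sqlite3.connect(":memory:")
run = make_memoized(conn, "simulate_calls", simulate)

# The third call repeats an input, so it is answered from the table.
results = [run(e) for e in (1.0, 2.0, 1.0)]

# The recorded inputs and outputs are ordinary relational data.
rows = conn.execute(
    "SELECT input, output FROM simulate_calls "
    "WHERE output > 2.0 ORDER BY input"
).fetchall()
```

Keeping the input/output pairs in a table means the same rows serve both as a memoization cache and as a queryable record of what was run, which is one way to read the summary's claim that this relational subset can act as an interface to the full data set.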

