Abstract
Meteorological data contribute significantly to “Big Data”: they comprise multi-dimensional raster data cubes of up to 5-D, with single cubes reaching multi-Petabyte sizes. Because databases have traditionally lacked support for raster data, file-based implementations rather than databases have been used to serve such data to the community. Array databases overcome this limitation by providing storage and query support for multi-dimensional arrays.
In this paper, we present a case study conducted by Deutscher Wetterdienst (DWD) in which the extraction and processing of gridded meteorological data sets were investigated hands-on. Following a brief introduction to the rasdaman DBMS, we present the database schema and a series of array queries selected for their practical importance in weather forecast services. We discuss several issues that came up, such as null values and time modeling, and how they were addressed. To the best of our knowledge, this is the first non-academic deployment of an array database for data sets of up to 5-D.
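To make the kind of operation discussed here more concrete, the following is a minimal sketch of extracting a hyper-rectangular subset from a 5-D gridded cube and aggregating it while skipping null values. It is plain Python and is not rasdaman's query language or API. The axis order (run, step, level, lat, lon), the sentinel `NULL`, and the helper names `flat_index` and `subset_mean` are assumptions chosen for illustration only.

```python
from itertools import product

NULL = -9999.0  # hypothetical null-value sentinel, not taken from the paper


def flat_index(shape, idx):
    """Return the row-major offset of a multi-dimensional index."""
    offset = 0
    for extent, i in zip(shape, idx):
        offset = offset * extent + i
    return offset


def subset_mean(cells, shape, ranges):
    """Mean over the hyper-rectangle given by half-open (lo, hi) ranges,
    one per axis, ignoring cells that hold the NULL sentinel."""
    total, count = 0.0, 0
    for idx in product(*(range(lo, hi) for lo, hi in ranges)):
        value = cells[flat_index(shape, idx)]
        if value != NULL:
            total += value
            count += 1
    return total / count if count else None


# Assumed axis order: (run, step, level, lat, lon).
shape = (2, 3, 2, 4, 4)
cells = [float(i) for i in range(2 * 3 * 2 * 4 * 4)]
cells[flat_index(shape, (0, 0, 0, 0, 0))] = NULL  # mark one cell as missing

# Fix run, step, and level; keep the full lat/lon plane.
result = subset_mean(cells, shape, [(0, 1), (0, 1), (0, 1), (0, 4), (0, 4)])
```

An array database evaluates such a subset and aggregation on the server and returns only the result, whereas a file-based service typically has to ship or scan whole files before the client can reduce them.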








Notes
Chosen randomly for the purpose of this example.
Consortium for Small Scale Modeling.
The different name has been chosen to avoid confusion.
Acknowledgements
This work has been partially supported by Deutscher Wetterdienst (www.dwd.de).
Misev, D., Baumann, P. & Seib, J. Towards Large-Scale Meteorological Data Services: A Case Study. Datenbank Spektrum 12, 183–192 (2012). https://doi.org/10.1007/s13222-012-0103-9
DOI: https://doi.org/10.1007/s13222-012-0103-9