Retrospective Satellite Data in the Cloud: An Array DBMS Approach

  • Conference paper
Supercomputing (RuSCDays 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 793)

Abstract

Earth remote sensing has always been a source of “big” data, and satellite data have inspired the development of “array” DBMSs. An array DBMS processes N-dimensional (N-d) arrays using a declarative query style to simplify raster data management and processing. However, raster data are traditionally stored in files, not in databases, and command line tools have long been developed to process these files. Most of these tools are feature-rich and free but optimized for a single machine. An approach that partially delegates in situ raster data processing to such tools has recently been proposed. It includes a new formal N-d array data model that abstracts from the files and the tools, as well as new distributed algorithms based on the model. This paper extends the approach with a new algorithm for the reshaping (tiling) of N-d arrays. The algorithm physically reorganizes the storage layout of N-d arrays to obtain an order of magnitude speedup. The extended approach outperforms SciDB by up to 28× on retrospective Landsat data, one of the most typical and popular kinds of satellite imagery. SciDB is the only freely available distributed array DBMS to date. Experiments were carried out on an 8-node cluster in Microsoft Azure Cloud.
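
To make the notion of reshaping (tiling) concrete, the following is a minimal generic sketch in Python. It is not the algorithm from the paper, whose details are not given in this abstract. It only illustrates the general idea of regrouping the cells of a row-major N-d array so that each tile becomes one contiguous unit. The function names `tile_bounds` and `reshape_to_tiles`, and the choice to key tiles by their origin, are assumptions made for illustration.

```python
from itertools import product


def tile_bounds(shape, tile_shape):
    """Yield one tuple of (start, stop) pairs per tile of an N-d index space.

    Tiles on the upper edge of an axis are clipped to the array shape.
    """
    if len(shape) != len(tile_shape):
        raise ValueError("shape and tile_shape must have the same length")
    if any(step <= 0 for step in tile_shape):
        raise ValueError("tile sizes must be positive")
    per_axis = [
        [(start, min(start + step, size)) for start in range(0, size, step)]
        for size, step in zip(shape, tile_shape)
    ]
    yield from product(*per_axis)


def reshape_to_tiles(flat, shape, tile_shape):
    """Regroup a row-major flat buffer into per-tile lists of cells.

    Returns a dict that maps each tile origin (a tuple of start indices)
    to the tile's cells, listed in row-major order within the tile.
    """
    # Row-major strides: the last axis varies fastest.
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    strides.reverse()

    tiles = {}
    for bounds in tile_bounds(shape, tile_shape):
        cells = []
        for index in product(*(range(lo, hi) for lo, hi in bounds)):
            offset = sum(i * s for i, s in zip(index, strides))
            cells.append(flat[offset])
        tiles[tuple(lo for lo, _ in bounds)] = cells
    return tiles


# Hypothetical example: a 4 x 6 array stored row by row, cut into 2 x 3 tiles.
example = reshape_to_tiles(list(range(24)), (4, 6), (2, 3))
print(example[(0, 0)])  # [0, 1, 2, 6, 7, 8]
```

In the original row-major layout, reading one 2 × 3 window touches two separate rows of the buffer. After regrouping, the same window is a single contiguous list, which is the kind of locality that a tiled storage layout is meant to provide.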

Acknowledgments

This work was partially supported by the Russian Foundation for Basic Research (grant №16-37-00416). We also thank the anonymous reviewers for their helpful and inspiring comments.

Contributions. Rodriges: all text, figures, algorithms, ChronosServer, its data model, Azure management code, SciDB import code, experimental setup. Pozdeev: SciDB cluster deployment. Bryukhov: partial implementation of the reshaping algorithm for one machine, adaptation of the SciDB import code to Landsat data. All authors: experiments.

Author information

Corresponding author

Correspondence to Ramon Antonio Rodriges Zalipynis.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Rodriges Zalipynis, R.A., Bryukhov, A., Pozdeev, E. (2017). Retrospective Satellite Data in the Cloud: An Array DBMS Approach. In: Voevodin, V., Sobolev, S. (eds) Supercomputing. RuSCDays 2017. Communications in Computer and Information Science, vol 793. Springer, Cham. https://doi.org/10.1007/978-3-319-71255-0_28

  • DOI: https://doi.org/10.1007/978-3-319-71255-0_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-71254-3

  • Online ISBN: 978-3-319-71255-0

  • eBook Packages: Computer Science, Computer Science (R0)
