Scalable Architecture for Mining Big Earth Observation Data: SAMBEO

  • Conference paper
Computer Vision and Image Processing (CVIP 2022)

Abstract

A variety of sensors aboard satellites orbiting the Earth generate a huge amount of raw data, called Big Earth Observation Data (BEOD). The data collected by these sensors contain information important to various applications. Researchers in the relevant areas have proposed a number of architectures for processing earth observation (EO) data. The spatio-temporal nature of the data poses challenges in storage and archival, retrieval, processing, analysis, and visualization. A scalable solution is required to handle the exponential rise in data.

To address the scalability issues, several well-known distributed architectures for processing spatio-temporal data have recently been proposed, including HadoopGIS, SpatialSpark, STARK, Sedona, GeoMesa and GeoWave. In this paper, we present an architecture for mining BEOD that provides scalability in every phase, from storage to analysis. The architecture can also analyze the data through machine learning and deep learning models. We present a performance comparison on the different types of spatial queries involved in efficient data access, which use spatial indexing techniques over data stored in a distributed environment. Finally, we demonstrate the performance of the proposed architecture on EO data obtained from the INSAT-3D Imager.
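As a rough illustration of the kind of index-backed spatial query mentioned above, the following is a minimal single-machine sketch of a range query over a uniform grid index. It is not the SAMBEO implementation and does not use the API of any of the systems named; the class `GridIndex`, its methods, and the sample coordinates are all hypothetical. In a distributed setting each grid cell would typically correspond to a data partition, so that a query only touches the partitions its bounding box overlaps.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]                # (longitude, latitude)
BBox = Tuple[float, float, float, float]   # (min_lon, min_lat, max_lon, max_lat)


class GridIndex:
    """Hypothetical uniform grid index; each cell stands in for one partition."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Tuple[int, int], List[Point]] = defaultdict(list)

    def _cell(self, lon: float, lat: float) -> Tuple[int, int]:
        # Floor division keeps negative coordinates in the correct cell.
        return (int(lon // self.cell_size), int(lat // self.cell_size))

    def insert(self, p: Point) -> None:
        self.cells[self._cell(*p)].append(p)

    def range_query(self, box: BBox) -> List[Point]:
        min_lon, min_lat, max_lon, max_lat = box
        cx0, cy0 = self._cell(min_lon, min_lat)
        cx1, cy1 = self._cell(max_lon, max_lat)
        hits: List[Point] = []
        # Visit only the cells that overlap the query box (the pruning step),
        # then filter the candidates exactly (the refinement step).
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for lon, lat in self.cells.get((cx, cy), []):
                    if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                        hits.append((lon, lat))
        return hits


# Made-up sample points, for illustration only.
index = GridIndex(cell_size=1.0)
for point in [(72.5, 23.0), (72.9, 23.4), (77.2, 28.6), (88.3, 22.5)]:
    index.insert(point)

in_box = index.range_query((72.0, 22.5, 73.0, 24.0))
```

A uniform grid is used here only because it is the simplest structure that shows the prune-then-refine pattern; R-trees or quadtrees follow the same two-step idea with different partitioning.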

Author information

Corresponding author

Correspondence to Neha Sisodiya.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sisodiya, N., Vyas, K., Dube, N., Thakkar, P. (2023). Scalable Architecture for Mining Big Earth Observation Data: SAMBEO. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1776. Springer, Cham. https://doi.org/10.1007/978-3-031-31407-0_38

  • DOI: https://doi.org/10.1007/978-3-031-31407-0_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31406-3

  • Online ISBN: 978-3-031-31407-0

  • eBook Packages: Computer Science, Computer Science (R0)
