Abstract
The variety of sensors aboard satellites orbiting the earth generates a huge amount of raw data, called Big Earth Observation Data (BEOD). The data collected by these sensors contains information important to various applications. Researchers working in the relevant areas have proposed a number of architectures for processing earth observation (EO) data. The spatio-temporal nature of the data poses a variety of challenges in storage and archival, retrieval, processing, analysis, and visualization. A scalable solution is required to handle the exponential rise in data.
To address these scalability issues, several well-known distributed architectures for processing spatio-temporal data have recently been proposed, including HadoopGIS, SpatialSpark, STARK, Sedona, GeoMesa and GeoWave. In this paper, we present an architecture for mining BEOD that provides scalability in every phase, from storage to analysis. The architecture is also equipped to analyze the data through machine learning and deep learning models. We also present a comparison based on performance across the different types of spatial queries involved in accessing the data effectively; these queries utilize spatial indexing techniques for data stored in a distributed environment. Finally, we demonstrate the performance of the proposed architecture on EO data obtained through the INSAT-3D Imager.
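As a rough illustration of how a spatial index can speed up a range query, the following single-machine sketch buckets points into a uniform grid and scans only the cells that overlap the query rectangle. It is not the architecture presented in the paper, and it does not reproduce the indexes of any system named above; the class name GridIndex, the cell size, and the sample coordinates are assumptions made purely for illustration.

```python
from collections import defaultdict
from math import floor


class GridIndex:
    """Uniform grid index over (lon, lat) points; hypothetical, for illustration."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (cell_x, cell_y) -> list of points

    def _cell(self, lon, lat):
        # Map a coordinate to the grid cell that contains it.
        return (floor(lon / self.cell_size), floor(lat / self.cell_size))

    def insert(self, point_id, lon, lat):
        self.cells[self._cell(lon, lat)].append((point_id, lon, lat))

    def range_query(self, min_lon, min_lat, max_lon, max_lat):
        # Visit only the cells overlapping the query rectangle,
        # then filter the candidates with an exact containment test.
        cx0, cy0 = self._cell(min_lon, min_lat)
        cx1, cy1 = self._cell(max_lon, max_lat)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for pid, lon, lat in self.cells.get((cx, cy), ()):
                    if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                        hits.append(pid)
        return sorted(hits)


# Sample points (assumed values, not data from the paper).
index = GridIndex(cell_size=10.0)
index.insert("a", 72.5, 23.0)
index.insert("b", 77.2, 28.6)
index.insert("c", -0.1, 51.5)

result = index.range_query(70.0, 20.0, 80.0, 30.0)
print(result)
```

In a distributed setting the same idea is typically applied in two levels, with a global partitioning of the data across nodes and a local index inside each partition, so that a query touches only the partitions whose extent overlaps the query window.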
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sisodiya, N., Vyas, K., Dube, N., Thakkar, P. (2023). Scalable Architecture for Mining Big Earth Observation Data: SAMBEO. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1776. Springer, Cham. https://doi.org/10.1007/978-3-031-31407-0_38
DOI: https://doi.org/10.1007/978-3-031-31407-0_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31406-3
Online ISBN: 978-3-031-31407-0
eBook Packages: Computer Science (R0)