DOI: 10.1145/3332186.3333266
Extended abstract

Spatial Data Decomposition and Load Balancing on HPC Platforms

Published: 28 July 2019

Abstract

We are in the era of Spatial Big Data. Owing to advances in topographic techniques, clear satellite imagery, and various means of collecting information, geospatial datasets are growing in volume, complexity, and heterogeneity. For example, OpenStreetMap data for the whole world is about 1 TB, and NASA world climate datasets are about 17 TB. The volume and variety of spatial data make spatial computations both data-intensive and compute-intensive. Because spatial data is irregularly distributed, domain decomposition is challenging. In this work, we present a spatial data partitioning technique that takes spatial join cost into account. In addition, we present spatial join computation using the Asynchronous Dynamic Load Balancing (ADLB) library, a software library designed to help rapidly build scalable parallel programs using MPI. We evaluated the performance of an ADLB-based MPI-GIS implementation. In our existing work, the cost of moving spatial data from the ADLB server to the worker MPI processes limited the scalability of MPI-GIS.
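The abstract does not spell out the partitioning algorithm, so the following is only a minimal sketch of the general idea of cost-aware decomposition, not the authors' method. It assumes a uniform grid, uses the product of per-cell object counts from the two input layers as a rough proxy for spatial join cost, and assigns cells to workers with a static greedy heuristic rather than ADLB's dynamic work pool. All function names (`cells_for_box`, `estimate_cell_costs`, `assign_cells`) and the sample bounding boxes are hypothetical.

```python
from collections import defaultdict
import heapq


def cells_for_box(box, cell_size):
    """Yield the (col, row) grid cells overlapped by an axis-aligned box."""
    xmin, ymin, xmax, ymax = box
    for cx in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for cy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            yield (cx, cy)


def estimate_cell_costs(layer_r, layer_s, cell_size):
    """Assumed cost proxy: (#boxes of R in cell) * (#boxes of S in cell)."""
    count_r = defaultdict(int)
    count_s = defaultdict(int)
    for box in layer_r:
        for cell in cells_for_box(box, cell_size):
            count_r[cell] += 1
    for box in layer_s:
        for cell in cells_for_box(box, cell_size):
            count_s[cell] += 1
    # Only cells holding objects from both layers can produce join pairs.
    return {c: count_r[c] * count_s[c] for c in count_r if c in count_s}


def assign_cells(costs, num_workers):
    """Greedy heuristic: most expensive cell goes to the least loaded worker."""
    heap = [(0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(num_workers)}
    for cell, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(cell)
        heapq.heappush(heap, (load + cost, worker))
    loads = {w: sum(costs[c] for c in cells) for w, cells in assignment.items()}
    return assignment, loads


# Hypothetical bounding boxes (xmin, ymin, xmax, ymax) for two layers.
layer_r = [(0, 0, 1, 1), (0.5, 0.5, 1.5, 1.5), (0.2, 0.2, 0.8, 0.8), (10, 10, 11, 11)]
layer_s = [(0, 0, 2, 2), (0.1, 0.1, 0.9, 0.9), (10.5, 10.5, 10.9, 10.9)]

costs = estimate_cell_costs(layer_r, layer_s, cell_size=5.0)
assignment, loads = assign_cells(costs, num_workers=2)
print("estimated cost per cell:", costs)
print("estimated load per worker:", loads)
```

With skewed data, a single dense cell can dominate the total estimated cost, which is one reason a dynamic scheme in which workers pull tasks as they finish may balance better than a static assignment like this one.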

Cited By

  • (2022) Accelerating Spatial Autocorrelation Computation with Parallelization, Vectorization and Memory Access Optimization: With a focus on rapid recalculation of COVID related spatial statistics for faster geospatial analysis and response. 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 544-554. DOI: 10.1109/CCGrid54584.2022.00064. Online publication date: May 2022.
  • (2020) Efficient Parallel and Adaptive Partitioning for Load-balancing in Spatial Join. 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 810-820. DOI: 10.1109/IPDPS47924.2020.00088. Online publication date: May 2020.

Published In

PEARC '19: Practice and Experience in Advanced Research Computing 2019: Rise of the Machines (learning)
July 2019
775 pages
ISBN: 9781450372275
DOI: 10.1145/3332186
General Chair: Tom Furlani
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. HPC
  2. Message Passing Interface
  3. Parallel IO
  4. Spatial Data
  5. Spatial Join

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Conference

PEARC '19

Acceptance Rates

Overall Acceptance Rate 133 of 202 submissions, 66%
