Geographical information system parallelization for spatial big data processing: a review

Cluster Computing

Abstract

With growing interest in large-scale, high-resolution, and real-time geographic information system (GIS) applications and in spatial big data processing, traditional GIS can no longer handle the required loads efficiently because of its limited computational capabilities. Various attempts have been made to address these challenges by adopting high performance computing techniques from other application areas, such as advanced architecture designs, data partition strategies, and direct parallelization of spatial analysis algorithms. This paper surveys the current state of parallel GIS with respect to parallel GIS architectures, parallel processing strategies, and related topics. We present the general evolution of GIS architecture, including the two main parallel GIS architectures, which are based on high performance computing clusters and Hadoop clusters. We then summarize current spatial data partition strategies, key methods for realizing parallel GIS from the viewpoint of data decomposition, and progress on specialized parallel GIS algorithms. We use the parallel processing of GRASS as a case study. Finally, we identify key problems and potential future research directions for parallel GIS.
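
To make the idea of data decomposition concrete, the following is a minimal sketch, not taken from the paper, of one simple partition strategy: a uniform grid over point data, with an independent task run on each partition. The function names (grid_partition, centroid, parallel_map_cells), the sample points, and the choice of a centroid as the per-partition task are all illustrative assumptions. A thread pool is used only to keep the sketch self-contained; a real parallel GIS would more likely distribute partitions across processes or cluster nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def grid_partition(points, bounds, nx, ny):
    """Assign each (x, y) point to a cell of an nx-by-ny uniform grid."""
    min_x, min_y, max_x, max_y = bounds
    cell_w = (max_x - min_x) / nx
    cell_h = (max_y - min_y) / ny
    cells = defaultdict(list)
    for x, y in points:
        # Clamp so points on the upper boundary fall in the last cell.
        col = min(int((x - min_x) / cell_w), nx - 1)
        row = min(int((y - min_y) / cell_h), ny - 1)
        cells[(col, row)].append((x, y))
    return dict(cells)


def centroid(cell_points):
    """Per-partition task: a stand-in for any local spatial operation."""
    n = len(cell_points)
    return (sum(x for x, _ in cell_points) / n,
            sum(y for _, y in cell_points) / n)


def parallel_map_cells(cells, task, workers=4):
    """Run `task` on every partition concurrently and collect the results."""
    keys = list(cells)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(task, (cells[k] for k in keys))
        return dict(zip(keys, results))


# Hypothetical sample data: five points in a 2 x 2 extent.
points = [(0.5, 0.5), (1.5, 0.5), (0.2, 1.8), (1.9, 1.9), (2.0, 2.0)]
cells = grid_partition(points, bounds=(0.0, 0.0, 2.0, 2.0), nx=2, ny=2)
centroids = parallel_map_cells(cells, centroid)
```

A uniform grid is the simplest choice and can leave partitions unbalanced when data are clustered, which is one motivation for alternative partition schemes such as those based on space-filling curves.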

Acknowledgments

This study is supported by the National Natural Science Foundation of China (41301028).

Author information

Corresponding authors

Correspondence to Lajiao Chen or Jijun He.

Cite this article

Zhao, L., Chen, L., Ranjan, R. et al. Geographical information system parallelization for spatial big data processing: a review. Cluster Comput 19, 139–152 (2016). https://doi.org/10.1007/s10586-015-0512-2

  • DOI: https://doi.org/10.1007/s10586-015-0512-2
