Improving the performance of GIS polygon overlay computation with MapReduce for spatial big data processing

Cluster Computing

Abstract

Polygon overlay is one of the important operations in Geographic Information System (GIS) spatial analysis, and it is time-consuming in many big data cases. In this paper, we propose a specially designed MapReduce algorithm with a grid index to decrease the running time. With the aid of the grid index, the proposed algorithm reduces the number of calls to the intersection computation. The experiments were carried out on a Hadoop-based cloud framework that we built ourselves. Experimental results show that our algorithm with the spatial grid index consumes less time than its counterpart without a spatial index. Moreover, the speed-up ratio of the proposed algorithm increases as more nodes of the Hadoop framework are used, although the rate of increase slows down as nodes are added.
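The role of the grid index can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose details are not given in the abstract; it only assumes a uniform grid, polygons stored as vertex lists, and a map step that keys each polygon by the grid cells its bounding box touches. All names (`map_phase`, `reduce_phase`, `candidate_pairs`, `cell_size`) are hypothetical, and the intersection routine itself is left out: the sketch only shows how candidate pairs are narrowed before that routine would be called.

```python
from collections import defaultdict
from itertools import product


def bbox(polygon):
    """Axis-aligned bounding box of a polygon given as (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def cells_for(polygon, cell_size):
    """Grid cells (col, row) touched by the polygon's bounding box."""
    min_x, min_y, max_x, max_y = bbox(polygon)
    cols = range(int(min_x // cell_size), int(max_x // cell_size) + 1)
    rows = range(int(min_y // cell_size), int(max_y // cell_size) + 1)
    return product(cols, rows)


def map_phase(layer_name, polygons, cell_size):
    """Emit (cell, (layer, polygon id)) records, one per touched cell."""
    for poly_id, polygon in polygons.items():
        for cell in cells_for(polygon, cell_size):
            yield cell, (layer_name, poly_id)


def reduce_phase(records):
    """Group records by cell and pair base with overlay polygons per cell."""
    by_cell = defaultdict(lambda: {"base": set(), "overlay": set()})
    for cell, (layer_name, poly_id) in records:
        by_cell[cell][layer_name].add(poly_id)
    candidates = set()  # a set removes pairs that share several cells
    for groups in by_cell.values():
        candidates.update(product(groups["base"], groups["overlay"]))
    return candidates


def candidate_pairs(base, overlay, cell_size):
    """Pairs that would still need an exact intersection computation."""
    records = list(map_phase("base", base, cell_size))
    records += list(map_phase("overlay", overlay, cell_size))
    return reduce_phase(records)


# Illustrative data only: two polygons per layer.
base = {
    "b1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "b2": [(10, 10), (12, 10), (12, 12), (10, 12)],
}
overlay = {
    "o1": [(1, 1), (3, 1), (3, 3), (1, 3)],
    "o2": [(20, 20), (21, 20), (21, 21), (20, 21)],
}
pairs = candidate_pairs(base, overlay, cell_size=5)
print(pairs)  # {('b1', 'o1')}
```

In this toy input, comparing every base polygon with every overlay polygon would mean four intersection calls, while the grid leaves a single candidate pair. The cell size is a tuning assumption: cells that are too large filter little, and cells that are too small replicate each polygon into many cells.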



Acknowledgments

The authors would like to thank Professor Shanyu Tang for his valuable suggestions. The project was supported by the Fundamental Research Funds for National University, China University of Geosciences (Wuhan), under Grants CUGL110228 and CUGL120292.

Author information

Corresponding author

Correspondence to Yong Wang.

Cite this article

Wang, Y., Liu, Z., Liao, H. et al. Improving the performance of GIS polygon overlay computation with MapReduce for spatial big data processing. Cluster Comput 18, 507–516 (2015). https://doi.org/10.1007/s10586-015-0428-x


  • DOI: https://doi.org/10.1007/s10586-015-0428-x
