Abstract
Computer architectures have evolved into parallel and heterogeneous systems with multi-core CPUs, many-core GPUs, and vector instructions. Meanwhile, advances in data collection technologies have led to a rapid increase in the spatial and temporal resolution of geographic data. Efficiently dealing with large volumes of geographic data demands, now more than ever, an effective use of modern parallel computers. However, parallel programming is distinctly more challenging than writing sequential scripts. Moreover, parallelism is not the only issue; data locality is critical, too. This work addresses both issues, growing data volumes and the transition to parallel hardware, using a compiler approach to map algebra. More specifically, we design and implement a framework that uses compiler techniques to automatically speed up raster spatial analysis. In this way, users simply write sequential map algebra scripts in Python, which are translated into a graph where optimizations are applied. Then the scripts are parallelized, reordered for locality, and executed on OpenCL devices such as multi-core CPUs and GPUs. The novelty of our approach resides in the efficient organization of the execution, which we achieve via compilation. Unlike interpreters, our framework reorders the raster operations to maximize data reuse and minimize memory movements. The reordering occurs at two hierarchical levels and is controlled by a scheduler and a fusion technique. This strategy targets data locality, which, as we show, is key to the performance of raster spatial analysis. The experiments report speed-ups of one to two orders of magnitude compared to traditional interpreters.
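To make the contrast between interpretation and fusion concrete, the following is a minimal sketch in plain Python. It is not the framework's API: the names `Node`, `eval_interpreted`, and `eval_fused` are hypothetical, rasters are flat lists, and only local (cell-wise) operations are covered. A sequential script builds an expression graph through operator overloading; the interpreter-style evaluator materializes one temporary raster per operation, while the fused evaluator computes each output cell in a single pass.

```python
import operator


class Node:
    """Hypothetical graph node: an input raster, a scalar, or a binary local operation."""

    def __init__(self, op=None, args=(), data=None):
        self.op = op      # binary function, or None for inputs and scalars
        self.args = args  # the two child nodes of an operation
        self.data = data  # list of cell values, or a scalar constant

    def __add__(self, other):
        return Node(operator.add, (self, wrap(other)))

    def __sub__(self, other):
        return Node(operator.sub, (self, wrap(other)))

    def __mul__(self, other):
        return Node(operator.mul, (self, wrap(other)))


def wrap(value):
    """Turn plain scalars into constant nodes so they can join the graph."""
    return value if isinstance(value, Node) else Node(data=value)


def cell(value, i):
    """Read cell i of a raster; scalars are broadcast to every cell."""
    return value[i] if isinstance(value, list) else value


def length(node):
    """Number of cells in the rasters feeding this node (0 for pure scalars)."""
    if node.op is None:
        return len(node.data) if isinstance(node.data, list) else 0
    return max(length(arg) for arg in node.args)


def eval_interpreted(node, counter):
    """Interpreter style: every operation writes a full temporary raster."""
    if node.op is None:
        return node.data
    left, right = [eval_interpreted(arg, counter) for arg in node.args]
    counter["temporaries"] += 1
    return [node.op(cell(left, i), cell(right, i)) for i in range(length(node))]


def eval_cell(node, i):
    """Evaluate the whole expression for one cell without storing intermediates."""
    if node.op is None:
        return cell(node.data, i)
    left, right = node.args
    return node.op(eval_cell(left, i), eval_cell(right, i))


def eval_fused(node, counter):
    """Fused style: one pass over the cells, only the output raster is stored."""
    counter["temporaries"] += 1
    return [eval_cell(node, i) for i in range(length(node))]


# A sequential "script" over two toy rasters (illustrative values only).
dem = Node(data=[1.0, 2.0, 3.0, 4.0])
rain = Node(data=[0.5, 0.5, 1.0, 1.0])
expr = (dem + rain) * 2 - dem

interp_stats = {"temporaries": 0}
fused_stats = {"temporaries": 0}
interp_result = eval_interpreted(expr, interp_stats)
fused_result = eval_fused(expr, fused_stats)
```

Both evaluators return the same raster, but the interpreted path allocates three rasters for the three operations, whereas the fused path allocates only the output. On real hardware each temporary raster is a round trip through memory, which is the cost a fusion technique aims to avoid. Generating and scheduling OpenCL kernels, as the abstract describes, is beyond this sketch.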
Acknowledgements
This work was supported by the Academy of Finland (decision numbers 259557 and 259995).
About this article
Cite this article
Carabaño, J., Westerholm, J. & Sarjakoski, T. A compiler approach to map algebra: automatic parallelization, locality optimization, and GPU acceleration of raster spatial analysis. Geoinformatica 22, 211–235 (2018). https://doi.org/10.1007/s10707-017-0312-3
DOI: https://doi.org/10.1007/s10707-017-0312-3