Abstract
This paper investigates the colocation pattern mining problem for big spatial event data. Colocation patterns refer to subsets of spatial features whose instances are frequently located together. The problem is important in many applications such as analyzing relationships of crimes or disease with various environmental factors, but is computationally challenging due to a large number of instances, the potentially exponential number of candidate patterns, and high computational cost in generating pattern instances. Existing colocation mining algorithms (e.g., Apriori algorithm, multi-resolution filter, partial join and joinless approaches) are mostly sequential, and thus can be insufficient for big spatial event data. Recently, parallel colocation mining algorithms have been developed based on the Map-reduce framework. However, these algorithms need a large number of nodes to scale up, which is economically expensive, and their reducer nodes have a bottleneck of aggregating all instances of the same colocation patterns. Another work proposes a parallel colocation mining algorithm on GPU based on the iCPI tree and the joinless approach, but assumes that the number of neighbors for each instance is within a small constant, and thus may be inefficient when instances are dense and unevenly distributed. To address these limitations, we propose a grid-based GPU colocation mining algorithm that includes a novel cell aggregate based upper bound filter, and two refinement algorithms. We prove the correctness and completeness of proposed GPU algorithms. Preliminary results on both real world data and synthetic data show that proposed GPU algorithms are promising with over 30 times speedup on up to millions of instances.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Koperski, K., Han, J.: Discovery of spatial association rules in geographic information databases. In: Egenhofer, M.J., Herring, J.R. (eds.) SSD 1995. LNCS, vol. 951, pp. 47–66. Springer, Heidelberg (1995). doi:10.1007/3-540-60159-7_4
Morimoto, Y.: Mining frequent neighboring class sets in spatial databases. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 353–358. ACM (2001)
Huang, Y., Shekhar, S., Xiong, H.: Discovering colocation patterns from spatial data sets: a general approach. IEEE Trans. Knowl. Data Eng. 16(12), 1472–1485 (2004)
Yoo, J.S., Shekhar, S., Smith, J., Kumquat, J.P.: A partial join approach for mining co-location patterns. In: Proceedings of the 12th Annual ACM International Workshop on Geographic Information Systems, pp. 241–249. ACM (2004)
Yoo, J.S., Shekhar, S.: A joinless approach for mining spatial colocation patterns. IEEE Trans. Knowl. Data Eng. 18(10), 1323–1337 (2006)
Boinski, P., Zakrzewicz, M.: Collocation pattern mining in a limited memory environment using materialized iCPI-tree. In: International Conference on Data Warehousing and Knowledge Discovery, pp. 279–290. Springer, Heidelberg (2012)
Mohan, P., Shekhar, S., Shine, J.A., Rogers, J.P., Jiang, Z., Wayant, N.: A neighborhood graph based approach to regional co-location pattern discovery: A summary of results. In: Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 122–132. ACM (2011)
Wang, S., Huang, Y., Wang, X.S.: Regional co-locations of arbitrary shapes. In: Nascimento, M.A., Sellis, T., Cheng, R., Sander, J., Zheng, Y., Kriegel, H.-P., Renz, M., Sengstock, C. (eds.) SSTD 2013. LNCS, vol. 8098, pp. 19–37. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40235-7_2
Liu, B., Chen, L., Liu, C., Zhang, C., Qiu, W.: RCP mining: towards the summarization of spatial co-location patterns. In: Claramunt, C., Schneider, M., Wong, R.C.-W., Xiong, L., Loh, W.-K., Shahabi, C., Li, K.-J. (eds.) SSTD 2015. LNCS, vol. 9239, pp. 451–469. Springer, Cham (2015). doi:10.1007/978-3-319-22363-6_24
Celik, M., Kang, J.M., Shekhar, S.: Zonal co-location pattern discovery with dynamic parameters. In: Seventh IEEE International Conference on Data Mining. ICDM 2007, pp. 433–438. IEEE (2007)
Barua, S., Sander, J.: SSCP: mining statistically significant co-location patterns. In: Pfoser, D., Tao, Y., Mouratidis, K., Nascimento, M.A., Mokbel, M., Shekhar, S., Huang, Y. (eds.) SSTD 2011. LNCS, vol. 6849, pp. 2–20. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22922-0_2
Barua, S., Sander, J.: Mining statistically significant co-location and segregation patterns. IEEE Trans. Knowl. Data Eng. 26(5), 1185–1199 (2014)
Barua, S., Sander, J.: Statistically significant co-location pattern mining (2015)
Yoo, J.S., Bow, M.: Mining top-k closed co-location patterns. In: 2011 IEEE International Conference on Spatial Data Mining and Geographical Knowledge Services (ICSDM), pp. 100–105. IEEE (2011)
Huang, Y., Xiong, H., Shekhar, S., Pei, J.: Mining confident co-location rules without a support threshold. In: Proceedings of the 2003 ACM Symposium on Applied Computing, pp. 497–501. ACM (2003)
Yoo, J.S., Boulware, D., Kimmey, D.: A parallel spatial co-location mining algorithm based on mapreduce. In: 2014 IEEE International Congress on Big Data (BigData Congress), pp. 25–31. IEEE (2014)
Andrzejewski, W., Boinski, P.: GPU-accelerated collocation pattern discovery. In: Catania, B., Guerrini, G., Pokorný, J. (eds.) ADBIS 2013. LNCS, vol. 8133, pp. 302–315. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40683-6_23
Andrzejewski, W., Boinski, P.: A parallel algorithm for building iCPI-trees. In: Manolopoulos, Y., Trajcevski, G., Kon-Popovska, M. (eds.) ADBIS 2014. LNCS, vol. 8716, pp. 276–289. Springer, Cham (2014). doi:10.1007/978-3-319-10933-6_21
Yoo, J.S., Boulware, D.: Incremental and parallel spatial association mining. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 75–76. IEEE (2014)
Andrzejewski, W., Boinski, P.: Parallel GPU-based plane-sweep algorithm for construction of iCPI-trees. J. Database Manage. (JDM) 26(3), 1–20 (2015)
Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In: Proceedings of 20th International Conference on Very Large Databases, VLDB. vol. 1215, pp. 487–499 (1994)
City of Seattle, Department of Information Technology, Seattle Police Department: Seattle Police Department 911 Incident Response. https://data.seattle.gov/Public-Safety/Seattle-Police-Department-911-Incident-Response/3k2p-39jp
Acknowledgement
We would like to thank Dr. Simin You for helpful comments and suggestions.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Sainju, A.M., Jiang, Z. (2017). Grid-Based Colocation Mining Algorithms on GPU for Big Spatial Event Data: A Summary of Results. In: Gertz, M., et al. Advances in Spatial and Temporal Databases. SSTD 2017. Lecture Notes in Computer Science(), vol 10411. Springer, Cham. https://doi.org/10.1007/978-3-319-64367-0_14
Download citation
DOI: https://doi.org/10.1007/978-3-319-64367-0_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64366-3
Online ISBN: 978-3-319-64367-0
eBook Packages: Computer ScienceComputer Science (R0)