Abstract
A spatial co-location pattern is a subset of spatial features whose instances are frequently located together in geographic space. Mining co-location patterns is particularly valuable for discovering spatial dependencies. Traditional co-location pattern mining algorithms become computationally expensive as data volume grows rapidly. In this paper, we explore a novel iterative framework based on parallel ordered-clique-growth for co-location pattern mining. The ordered clique extension can reuse previously processed information and be executed in parallel, which speeds up the identification of co-location instances. Based on this iterative framework, we design a MapReduce algorithm, PCPM_OC, that searches for prevalent co-location patterns in a level-wise manner. To narrow the search space of ordered cliques, two pruning techniques are proposed to filter out as many invalid clique instances as possible. We prove the completeness and correctness of PCPM_OC and discuss its complexity. Moreover, we compare PCPM_OC with two advanced MapReduce-based co-location pattern mining algorithms from multiple perspectives. Finally, extensive experiments are conducted on synthetic and real-world spatial datasets to study the performance of PCPM_OC. Experimental results demonstrate that PCPM_OC significantly improves efficiency and shows better scalability on massive spatial data.
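To make the idea of level-wise ordered clique growth concrete, the following is a minimal single-machine sketch and not the PCPM_OC algorithm itself. It assumes that features are ordered alphabetically, that two instances are neighbors when their Euclidean distance is at most a threshold `D`, and that a co-location instance is a clique of mutually neighboring instances with distinct features. The toy data, the threshold, and the helper names `ordered_neighbors` and `grow` are all hypothetical.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: (feature, instance id) -> (x, y). Not taken from the paper.
points = {
    ("A", 1): (0.0, 0.0),
    ("B", 1): (1.0, 0.0),
    ("C", 1): (0.5, 0.8),
    ("A", 2): (5.0, 5.0),
    ("B", 2): (5.5, 5.0),
    ("C", 2): (9.0, 9.0),
}
D = 1.5  # assumed neighbor distance threshold


def ordered_neighbors(points, d):
    """Map each instance to its neighbors whose feature sorts strictly after its own."""
    nbrs = {p: set() for p in points}
    # sorted() guarantees p comes before q, so p's feature never sorts after q's.
    for p, q in combinations(sorted(points), 2):
        if p[0] != q[0] and dist(points[p], points[q]) <= d:
            nbrs[p].add(q)
    return nbrs


def grow(cliques, nbrs):
    """Extend each ordered k-clique to ordered (k+1)-cliques.

    A candidate must be an ordered neighbor of every member, so the
    k-clique found at the previous level is reused rather than rebuilt.
    """
    out = []
    for clique in cliques:
        candidates = set.intersection(*(nbrs[m] for m in clique))
        for c in sorted(candidates):
            out.append(clique + (c,))
    return out


nbrs = ordered_neighbors(points, D)
size1 = [(p,) for p in sorted(points)]
size2 = grow(size1, nbrs)  # ordered neighbor pairs
size3 = grow(size2, nbrs)  # triples built from the pairs
print(size2)
print(size3)
```

Because each clique is extended only with instances of later features, every clique is generated exactly once, and the extension of one clique does not depend on any other clique at the same level. That independence is what would allow the loop in `grow` to be distributed across workers in a MapReduce setting.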












Acknowledgements
This work is supported by the National Natural Science Foundation of China (61472346, 61662086, 61762090), and the Project of Innovative Research Team of Yunnan Province (2018HC019).
About this article
Cite this article
Yang, P., Wang, L. & Wang, X. A MapReduce approach for spatial co-location pattern mining via ordered-clique-growth. Distrib Parallel Databases 38, 531–560 (2020). https://doi.org/10.1007/s10619-019-07278-7
DOI: https://doi.org/10.1007/s10619-019-07278-7