Abstract
Spatiotemporal co-occurrence patterns (STCOPs) are subsets of feature types whose instances frequently co-occur in both space and time. A spatiotemporal co-occurrence is a spatiotemporal overlap relationship among two or more spatiotemporal instances, holding in both the spatial and temporal dimensions. STCOPs can potentially be used to predict and understand the generation and evolution of different types of interacting phenomena in scientific fields such as astronomy, meteorology, biology, and geosciences. Meaningful and statistically significant data analysis in these fields requires processing sufficiently large datasets. Because the spatiotemporal operations required for mining spatiotemporal co-occurrences are computationally expensive, it is increasingly difficult to identify spatiotemporal co-occurrences and discover STCOPs in centralized system settings. As a solution, we developed a cloud-based distributed mining system for discovering STCOPs. Our system uses Accumulo, a column-oriented non-relational database management system, as its backbone. To mine STCOPs efficiently, we propose three data models for managing trajectory-based spatiotemporal data in Accumulo. We introduce an in-memory join-index structure and a join algorithm for effectively performing spatiotemporal join operations on spatiotemporal trajectories in non-relational databases. Lastly, through experiments with artificial and real-life datasets, we evaluate the performance of the proposed models for STCOP mining.
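To make the notion of a spatiotemporal co-occurrence concrete, the following minimal sketch checks when two trajectory instances overlap in both time and space. It is an illustration under simplifying assumptions, not the system described in this article: each region is approximated by an axis-aligned bounding box rather than a polygon, each trajectory is a plain in-memory dictionary keyed by timestamp, and the names `Box`, `Trajectory`, `boxes_overlap`, and `co_occurrence_times`, along with the sample data, are hypothetical.

```python
from typing import Dict, List, Tuple

# (min_x, min_y, max_x, max_y): an axis-aligned box standing in for a region
# polygon. This is a simplifying assumption for illustration only.
Box = Tuple[float, float, float, float]
# One instance's trajectory: timestamp -> region occupied at that timestamp.
Trajectory = Dict[int, Box]


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True when two axis-aligned boxes share a region of positive area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def co_occurrence_times(t1: Trajectory, t2: Trajectory) -> List[int]:
    """Return the timestamps at which both instances exist and their regions overlap."""
    shared = sorted(t1.keys() & t2.keys())  # temporal overlap
    return [t for t in shared if boxes_overlap(t1[t], t2[t])]  # spatial overlap


# Hypothetical instances of two different feature types.
instance_a = {1: (0, 0, 2, 2), 2: (1, 1, 3, 3), 3: (2, 2, 4, 4)}
instance_b = {2: (2, 2, 5, 5), 3: (5, 5, 6, 6), 4: (0, 0, 1, 1)}

print(co_occurrence_times(instance_a, instance_b))  # prints [2]
```

The two instances share timestamps 2 and 3, but their regions intersect only at timestamp 2, so that is the only co-occurrence reported. Boxes that merely touch along an edge or corner are not counted as overlapping in this sketch.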
Notes
Apache Accumulo - https://accumulo.apache.org/
HBase - http://hbase.apache.org/
Cassandra - http://cassandra.apache.org/
Heliophysics Event Registry - https://www.lmsal.com/hek/api.html
NBA.com/Stats - http://stats.nba.com/
Amazon Web Services, Cloud Computing Services - http://aws.amazon.com/
JTS Topology Suite - http://www.vividsolutions.com/jts/JTSHome.htm
Acknowledgments
This work was supported in part by two NASA Grant Awards (No. NNX11AM13A, and No. NNX15AF39G), and one NSF Grant Award (No. AC1443061). The NSF Grant Award has been supported by funding from the Division of Advanced Cyberinfrastructure within the Directorate for Computer and Information Science and Engineering, the Division of Astronomical Sciences within the Directorate for Mathematical and Physical Sciences, and the Division of Atmospheric and Geospace Sciences within the Directorate for Geosciences.
Cite this article
Aydin, B., Akkineni, V. & Angryk, R. Mining spatiotemporal co-occurrence patterns in non-relational databases. Geoinformatica 20, 801–828 (2016). https://doi.org/10.1007/s10707-016-0255-0
DOI: https://doi.org/10.1007/s10707-016-0255-0