ABSTRACT
The massive volumes of data series continuously generated and collected by applications require new indices to speed up the data series similarity queries on which various data mining techniques rely. However, state-of-the-art iSAX-based indexing techniques do not scale well, because their binary fanout leads to a very deep index tree, and they suffer from accuracy degradation, because their character-level cardinality preserves proximity poorly. To address these problems, we recently proposed TARDIS to support indexing and querying billion-scale data series datasets. It introduces a new iSAX-T signature to reduce the cardinality conversion cost and a corresponding sigTree to construct a compact index structure that better preserves similarity. The framework consists of one centralized index and local distributed indices to efficiently re-partition and index dimensional datasets. In addition, effective query strategies based on the sigTree structure are proposed to greatly improve accuracy. In this demonstration, we present GENET, a new interactive exploration demonstration that lets users perform Big Data Series Approximate Retrieval and Recursive Interactive Clustering on large-scale geospatial datasets using TARDIS indexing techniques.
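For readers unfamiliar with the signatures that iSAX-style indices build on, the following is a minimal sketch of a plain SAX word, not the iSAX-T signature itself, whose details the abstract does not give. It assumes equiprobable Gaussian breakpoints and a series length divisible by the number of segments. The function names `breakpoints` and `sax_word` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def breakpoints(cardinality):
    """Cut points that split a standard normal into equiprobable regions."""
    return [NormalDist().inv_cdf(i / cardinality) for i in range(1, cardinality)]


def sax_word(series, segments, cardinality):
    """Return one integer symbol per segment (illustrative plain SAX)."""
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a constant series maps to all zeros
    normed = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise Aggregate Approximation: mean of each equal-width segment
    # (assumption: len(series) is divisible by `segments`)
    size = len(normed) // segments
    paa = [mean(normed[i * size:(i + 1) * size]) for i in range(segments)]
    cuts = breakpoints(cardinality)
    # A symbol is the index of the region its segment mean falls into
    return [bisect_right(cuts, v) for v in paa]


series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
word8 = sax_word(series, segments=4, cardinality=8)
word4 = sax_word(series, segments=4, cardinality=4)
# With power-of-two cardinalities, dropping the lowest bit of each
# cardinality-8 symbol yields the cardinality-4 symbol.
demoted = [s >> 1 for s in word8]
```

In this sketch, `demoted` equals `word4`, which illustrates the kind of cardinality conversion that iSAX-style indices perform when comparing signatures at different resolutions. How iSAX-T reduces the cost of that conversion is described in the TARDIS paper rather than here.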
- David A Kroodsma, Juan Mayorga, Timothy Hochberg, Nathan A Miller, Kristina Boerder, Francesco Ferretti, Alex Wilson, Bjorn Bergman, Timothy D White, et al. 2018. Tracking the global footprint of fisheries. Science (2018).
- U.S./Japan ASTER Science Team NASA/METI/AIST/Japan Spacesystems. 2019. ASTER Global Digital Elevation Model V003 [Dataset]. In NASA EOSDIS Land Processes DAAC.
- Themis Palpanas and Volker Beckmann. 2019. Report on the first and second interdisciplinary time series analysis workshop (itisa). ACM SIGMOD (2019).
- Jin Shieh and Eamonn Keogh. 2008. iSAX: indexing and mining terabyte sized time series. In SIGKDD. ACM, 623-631.
- Liang Zhang, Noura Alghamdi, Mohamed Y Eltabakh, and Elke A Rundensteiner. 2019. TARDIS: Distributed Indexing Framework for Big Time Series Data. In ICDE. IEEE, 1202-1213.
Index Terms
- Big Data Series Analytics Using TARDIS and its Exploitation in Geospatial Applications