Abstract
This paper studies aggregate search in transaction time databases. Specifically, each object in such a database can be modeled as a horizontal segment, whose y-projection is its search key, and whose x-projection represents the period when the key was valid in history. Given a query timestamp \(q_t\) and a key range \(\vec{q_k}\), a count-query retrieves the number of objects that are alive at \(q_t\) and whose keys fall in \(\vec{q_k}\). We provide a method that accurately answers such queries, with error less than \(\frac{1}{\varepsilon} + \varepsilon \cdot N_{\rm alive}(q_t)\), where \(N_{\rm alive}(q_t)\) is the number of objects alive at time \(q_t\), and \(\varepsilon\) is any constant in (0, 1]. Denoting the disk page size as \(B\), and \(n = N / B\), our technique requires \(O(n)\) space, processes any query in \(O(\log_B n)\) time, and supports each update in \(O(\log_B n)\) amortized I/Os. As demonstrated by extensive experiments, the proposed solutions guarantee query results with extremely high precision (median relative error below 5%), while consuming only a fraction of the space occupied by the existing approaches that promise precise results.
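To make the query semantics and the error bound concrete, the following is a minimal brute-force sketch in Python. It is not the indexing method proposed in the paper; it only spells out what an exact count-query returns and how the stated bound \(\frac{1}{\varepsilon} + \varepsilon \cdot N_{\rm alive}(q_t)\) is evaluated. The names `Segment`, `exact_count`, `n_alive`, and `error_bound` are hypothetical, and the half-open lifespan `[start, end)` and the closed key range are assumptions, since the abstract does not fix these conventions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One object: a horizontal segment in the time-key plane (hypothetical type)."""
    key: float              # y-projection: the search key
    start: float            # x-projection start: when the key became valid
    end: float = math.inf   # x-projection end: inf means the object is still alive


def is_alive(seg: Segment, q_t: float) -> bool:
    # Assumption: lifespans are half-open intervals [start, end).
    return seg.start <= q_t < seg.end


def n_alive(segments, q_t: float) -> int:
    """Number of objects alive at timestamp q_t, i.e. N_alive(q_t)."""
    return sum(1 for s in segments if is_alive(s, q_t))


def exact_count(segments, q_t: float, k_lo: float, k_hi: float) -> int:
    """Exact count-query by linear scan: alive at q_t and key in [k_lo, k_hi]."""
    return sum(1 for s in segments if is_alive(s, q_t) and k_lo <= s.key <= k_hi)


def error_bound(segments, q_t: float, eps: float) -> float:
    """Evaluate the bound 1/eps + eps * N_alive(q_t) quoted in the abstract."""
    if not 0 < eps <= 1:
        raise ValueError("eps must lie in (0, 1]")
    return 1 / eps + eps * n_alive(segments, q_t)


segments = [
    Segment(key=10, start=0, end=5),
    Segment(key=20, start=2),          # never deleted
    Segment(key=30, start=3, end=4),
    Segment(key=40, start=6),
]
print(exact_count(segments, q_t=3, k_lo=15, k_hi=35))  # keys 20 and 30 qualify
print(error_bound(segments, q_t=3, eps=0.5))           # 1/0.5 + 0.5 * 3
```

The linear scan takes time proportional to the number of segments per query, which is the cost a disk-based index with \(O(\log_B n)\) query time is meant to avoid. Note also that the bound depends on how many objects are alive at \(q_t\), not on the size of the key range.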
Cite this article
Tao, Y., Xiao, X. Efficient temporal counting with bounded error. The VLDB Journal 17, 1271–1292 (2008). https://doi.org/10.1007/s00778-007-0066-x
DOI: https://doi.org/10.1007/s00778-007-0066-x