
Computing Unrestricted Synopses Under Maximum Error Bound

Algorithmica

Abstract

Constructing Haar wavelet synopses with a guaranteed maximum error on data approximations has many real-world applications. In this paper, we take a novel approach towards constructing unrestricted Haar wavelet synopses under the maximum error metric (L∞). We first provide two linear-time (log N)-approximation algorithms, which have space complexities of O(log N) and O(N) respectively. These two algorithms have the advantage of being both simple in structure and naturally adaptable to stream data processing. Unlike traditional approaches to synopsis construction, which rely heavily on examining wavelet coefficients and their summations, the proposed methods are compact, scalable, and well suited to online data processing. We then demonstrate that this technique can be extended to other structures such as the Haar+ tree. Extensive experiments indicate that these techniques are highly practical. The proposed algorithms achieve an attractive tradeoff between efficiency and effectiveness, surpassing contemporary (log N)-approximation algorithms in compression quality.
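To make the terms in the abstract concrete, the following is a minimal Python sketch of an unnormalized Haar decomposition and of the maximum (L∞) error incurred when only some coefficients are retained. It is not the algorithm proposed in the paper; the function names (`haar_decompose`, `haar_reconstruct`, `synopsis_max_error`) and the sample data are assumptions chosen purely for illustration, and the sketch retains original coefficient values (a restricted synopsis) rather than choosing arbitrary values as an unrestricted synopsis may.

```python
def haar_decompose(data):
    """Unnormalized Haar transform of a list whose length is a power of two.

    Returns [overall average, detail coefficients from coarsest to finest].
    """
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    coeffs = []
    level = list(data)
    while len(level) > 1:
        pairs = range(0, len(level), 2)
        averages = [(level[i] + level[i + 1]) / 2 for i in pairs]
        details = [(level[i] - level[i + 1]) / 2 for i in pairs]
        coeffs = details + coeffs  # coarser levels end up in front
        level = averages
    return level + coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose."""
    level = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(level)]
        pos += len(level)
        expanded = []
        for avg, d in zip(level, details):
            expanded.extend([avg + d, avg - d])
        level = expanded
    return level


def synopsis_max_error(data, kept_indices):
    """L-infinity error when only the coefficients at kept_indices are stored.

    Illustrative helper (hypothetical, not from the paper): every other
    coefficient is treated as zero during reconstruction.
    """
    coeffs = haar_decompose(data)
    synopsis = [c if i in kept_indices else 0.0 for i, c in enumerate(coeffs)]
    approx = haar_reconstruct(synopsis)
    return max(abs(a - b) for a, b in zip(data, approx))


data = [2, 2, 0, 2, 3, 5, 4, 4]
print(haar_decompose(data))              # [2.75, -1.25, 0.5, 0.0, 0.0, -1.0, -1.0, 0.0]
print(synopsis_max_error(data, {0, 1}))  # 1.5
```

Keeping only the overall average and the coarsest detail coefficient approximates the data as [1.5, 1.5, 1.5, 1.5, 4, 4, 4, 4], so the worst-case deviation is 1.5 (at the third value). A maximum-error synopsis problem asks for a small set of stored values that keeps this worst-case deviation within a given bound.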


References

  1. Chakrabarti, K., Garofalakis, M., Rastogi, R., Shim, K.: Approximate query processing using wavelets. VLDB J. 10(2–3), 199–223 (2001)


  2. Garofalakis, M., Gibbons, P.B.: Probabilistic wavelet synopses. ACM Trans. Database Syst. 29(1), 43–90 (2004). doi:10.1145/974750.974753


  3. Guha, S.: Space efficiency in synopsis construction algorithms. In: VLDB ’05: Proceedings of the 31st International Conference on Very Large Data Bases, pp. 409–420. ACM, New York (2005)


  4. Guha, S., Harb, B.: Wavelet synopsis for data streams: minimizing non-Euclidean error. In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, KDD ’05, pp. 88–97. ACM, New York (2005). doi:10.1145/1081870.1081884


  5. Guha, S., Harb, B.: Approximation algorithms for wavelet transform coding of data streams. IEEE Trans. Inf. Theory 54(2), 811–830 (2008). doi:10.1109/TIT.2007.913569


  6. Guha, S., Shim, K., Woo, J.: Rehist: relative error histogram construction algorithms. In: VLDB ’04: Proceedings of the Thirtieth International Conference on Very Large Data Bases, pp. 300–311. Morgan Kaufmann, San Mateo (2004)


  7. Karras, P., Mamoulis, N.: One-pass wavelet synopses for maximum-error metrics. In: VLDB ’05: Proceedings of the 31st International Conference on Very Large Data Bases, pp. 421–432. ACM, New York (2005)


  8. Karras, P., Mamoulis, N.: Hierarchical synopses with optimal error guarantees. ACM Trans. Database Syst. 33(3), 1–53 (2008). doi:10.1145/1386118.1386124


  9. Karras, P., Sacharidis, D., Mamoulis, N.: Exploiting duality in summarization with deterministic guarantees. In: KDD ’07: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 380–389. ACM, New York (2007). doi:10.1145/1281192.1281235


  10. Matias, Y., Urieli, D.: Optimal workload-based weighted wavelet synopses. In: Proceedings of International Conference on Database Theory (ICDT), pp. 368–382 (2005)


  11. Matias, Y., Vitter, J.S., Wang, M.: Wavelet-based histograms for selectivity estimation. SIGMOD Rec. 27(2), 448–459 (1998). doi:10.1145/276305.276344


  12. Muthukrishnan, S.: Subquadratic algorithms for workload-aware Haar wavelet synopses. In: Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), pp. 285–296 (2005)


  13. Pang, C., Zhang, Q., Hansen, D., Maeder, A.: Building data synopses within a known maximum error bound. In: APWeb/WAIM’07: Proceedings of the Joint 9th Asia-Pacific Web and 8th International Conference on Web-Age Information Management Conference on Advances in Data and Web Management, pp. 463–470. Springer, Berlin (2007)


  14. Pang, C., Zhang, Q., Hansen, D., Maeder, A.: Unrestricted wavelet synopses under maximum error bound. In: EDBT ’09: Proceedings of the 12th International Conference on Extending Database Technology, pp. 732–743. ACM, New York (2009). doi:10.1145/1516360.1516445


  15. Reiss, F., Garofalakis, M., Hellerstein, J.M.: Compact histograms for hierarchical identifiers. In: Proceedings of the 32nd International Conference on Very Large Data Bases, VLDB ’06, pp. 870–881. ACM, New York (2006). http://portal.acm.org/citation.cfm?id=1182635.1164202


  16. Stollnitz, E.J., Derose, T.D., Salesin, D.H.: Wavelets for Computer Graphics: Theory and Applications. Morgan Kaufmann, San Francisco (1996)


  17. UCI KDD archive. http://kdd.ics.uci.edu

  18. Vitter, J.S., Wang, M., Iyer, B.: Data cube approximation and histograms via wavelets. In: CIKM ’98: Proceedings of the Seventh International Conference on Information and Knowledge Management, pp. 96–104. ACM, New York (1998). doi:10.1145/288627.288645


  19. Zhang, Q., Pang, C., Hansen, D.: On multidimensional wavelet synopses for maximum error bounds. In: DASFAA ’09: Proceedings of the 14th International Conference on Database Systems for Advanced Applications, pp. 646–661 (2009)



Author information


Corresponding author

Correspondence to Chaoyi Pang.

Additional information

Part of the results in this paper appeared in Proceedings of the 12th International Conference on Extending Database Technology (EDBT) [14].


About this article

Cite this article

Pang, C., Zhang, Q., Zhou, X. et al. Computing Unrestricted Synopses Under Maximum Error Bound. Algorithmica 65, 1–42 (2013). https://doi.org/10.1007/s00453-011-9571-9


  • DOI: https://doi.org/10.1007/s00453-011-9571-9
