Approximate Counting of Frequent Query Patterns over XQuery Stream

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2973)

Abstract

One efficient way to improve the performance of XML management systems is to cache frequently retrieved results. This requires discovering the frequent query patterns issued by users. In this paper, we model user queries as a stream of XML query pattern trees and mine frequent query patterns in a batch-wise manner. We design a novel data structure called D-GQPT to merge the pattern trees of the batches seen so far and to dynamically mark the active portion of the current batch. With the D-GQPT, we can limit the enumeration of candidate trees to the currently active pattern trees only. We also design a summary data structure called ECTree to incrementally compute the frequent tree patterns over the query stream. Based on these two constructs, we present AppXQSMiner, a frequent query pattern mining algorithm over the XML query stream. Experimental results show that the proposed approach is both efficient and scalable.
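To make the idea of approximate frequency counting over a stream concrete, here is a minimal sketch of the general Lossy Counting technique (Manku and Motwani) applied to query patterns. It is not the paper's AppXQSMiner, D-GQPT, or ECTree, whose details the abstract does not give. The class name `LossyCounter`, the parameters `epsilon` and `support`, and the representation of each query pattern as a flat path string are all assumptions made for illustration.

```python
import math


class LossyCounter:
    """Illustrative Lossy Counting over a stream of hashable items.

    Assumption: each query pattern has already been reduced to a
    hashable key (here a path string). This is not the paper's method.
    """

    def __init__(self, epsilon):
        self.epsilon = epsilon
        self.width = math.ceil(1 / epsilon)  # items per bucket
        self.n = 0                           # items seen so far
        self.entries = {}                    # item -> [count, max_undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # current bucket id
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # The item may have been seen and pruned in earlier buckets,
            # so its true count can exceed the stored one by bucket - 1.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def frequent(self, support):
        """Return items whose stored count is at least (support - epsilon) * n."""
        threshold = (support - self.epsilon) * self.n
        return {
            key: value[0]
            for key, value in self.entries.items()
            if value[0] >= threshold
        }


# Hypothetical stream of query patterns, encoded as path strings.
stream = ["a/b", "a/b", "a/c", "a/b", "a/d", "a/b", "a/b", "a/e"]
counter = LossyCounter(epsilon=0.25)
for pattern in stream:
    counter.add(pattern)
print(counter.frequent(support=0.5))
```

With `epsilon=0.25` each bucket holds four items, so rare patterns are pruned at the end of every bucket and memory stays bounded. The stored counts underestimate the true counts by at most `epsilon * n`.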

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, L.H., Lee, M.L., Hsu, W. (2004). Approximate Counting of Frequent Query Patterns over XQuery Stream. In: Lee, Y., Li, J., Whang, KY., Lee, D. (eds) Database Systems for Advanced Applications. DASFAA 2004. Lecture Notes in Computer Science, vol 2973. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24571-1_6

  • DOI: https://doi.org/10.1007/978-3-540-24571-1_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21047-4

  • Online ISBN: 978-3-540-24571-1

  • eBook Packages: Springer Book Archive
