Approximate Counting of Frequent Query Patterns over XQuery Stream

Yang, Liang Huai; Lee, Mong Li; Hsu, Wynne

doi:10.1007/978-3-540-24571-1_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2973)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

One efficient way to improve the performance of XML management systems is to cache frequently retrieved results. This requires discovering the frequent query patterns issued by users. In this paper, we model user queries as a stream of XML query pattern trees and mine for frequent query patterns in a batch-wise manner. We design a novel data structure called D-GQPT that merges the pattern trees of the batches seen so far and dynamically marks the active portion of the current batch. With the D-GQPT, we can limit the enumeration of candidate trees to the currently active pattern trees only. We also design a summary data structure called ECTree to incrementally compute the frequent tree patterns over the query stream. Based on these two constructs, we present AppXQSMiner, an algorithm for mining frequent query patterns over the XML query stream. Experimental results show that the proposed approach is both efficient and scalable.
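To make the idea of batch-wise approximate counting over a stream concrete, here is a minimal sketch in Python. It is not AppXQSMiner, and it does not implement D-GQPT or ECTree, whose details are not given in this abstract. Instead it uses the generic Lossy Counting technique for streams, and it assumes each query pattern tree has already been reduced to a canonical string such as `"book/title"`. The class name `LossyPatternCounter`, its parameters, and the example pattern strings are all hypothetical.

```python
import math


class LossyPatternCounter:
    """Approximate frequency counts over a stream (Lossy Counting).

    Illustrative stand-in only: patterns are plain strings here,
    not the query pattern trees used in the paper.
    """

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        # Each batch (bucket) holds ceil(1 / epsilon) stream items.
        self.width = math.ceil(1 / epsilon)
        self.n = 0  # number of patterns seen so far
        # pattern -> [observed count, maximum possible undercount]
        self.entries = {}

    def add(self, pattern):
        """Process one pattern from the stream."""
        self.n += 1
        bucket = math.ceil(self.n / self.width)
        if pattern in self.entries:
            self.entries[pattern][0] += 1
        else:
            # A new entry may have been missed in all earlier batches.
            self.entries[pattern] = [1, bucket - 1]
        if self.n % self.width == 0:
            # End of batch: drop entries that cannot be frequent.
            self.entries = {
                p: e for p, e in self.entries.items() if e[0] + e[1] > bucket
            }

    def frequent(self, support):
        """Return patterns whose observed count is at least
        (support - epsilon) * n, with their observed counts."""
        threshold = (support - self.epsilon) * self.n
        return {p: e[0] for p, e in self.entries.items() if e[0] >= threshold}


if __name__ == "__main__":
    counter = LossyPatternCounter(epsilon=0.1)
    # Hypothetical stream: two recurring patterns plus one-off patterns.
    for i in range(100):
        if i % 2 == 0:
            counter.add("book/title")
        elif i % 10 in (1, 3, 5):
            counter.add("book/author")
        else:
            counter.add(f"rare/pattern{i}")
    print(counter.frequent(support=0.25))
```

The observed counts never exceed the true counts and fall short by at most `epsilon * n`, so every pattern whose true frequency reaches `support * n` is reported. Rare patterns are discarded at batch boundaries, which keeps the summary small; that is the same general motivation the abstract gives for restricting work to the active portion of the current batch.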

Author information

Authors and Affiliations

School of Computing, National University of Singapore,
Liang Huai Yang, Mong Li Lee & Wynne Hsu

Editor information

Editors and Affiliations

Department of Computer Science, KAIST, 373-1 Guseong-dong Yuseong-gu, 305-701, Daejeon, Korea
YoonJoon Lee
School of Computer Science and Technology, Heilongjiang University, China
Jianzhong Li
Computer Science Department and Advanced Information Technology Research Center (AITrc), KAIST, Korea
Kyu-Young Whang
Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, 305-701, Daejeon, Republic of Korea
Doheon Lee

About this paper

Cite this paper

Yang, L.H., Lee, M.L., Hsu, W. (2004). Approximate Counting of Frequent Query Patterns over XQuery Stream. In: Lee, Y., Li, J., Whang, KY., Lee, D. (eds) Database Systems for Advanced Applications. DASFAA 2004. Lecture Notes in Computer Science, vol 2973. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24571-1_6

DOI: https://doi.org/10.1007/978-3-540-24571-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21047-4
Online ISBN: 978-3-540-24571-1
eBook Packages: Springer Book Archive
