DOI: 10.1145/1363686.1363918

Accurate histogram-based XML summarization

Published: 16 March 2008

Abstract

In this paper, we propose using histograms to characterize node set distributions in an XML document; these histograms can then be evaluated recursively for query optimization tasks. We identify and handle the special cases that arise when histograms are used to summarize structural aspects of XML documents. To demonstrate the potential of our approach, we perform comparative experiments on XTC, our native XML database management system.
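
The abstract does not spell out how the histograms are built, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes an equi-width histogram over pre-order document positions and a uniform spread of nodes within each bucket; the helper names (`document_positions`, `build_histogram`, `estimate_range`) and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = (
    "<lib>"
    "<book><title/><author/></book>"
    "<book><title/><author/><author/></book>"
    "<journal><title/></journal>"
    "</lib>"
)

def document_positions(root, tag):
    """Return the 0-based pre-order positions of all elements named `tag`."""
    return [i for i, el in enumerate(root.iter()) if el.tag == tag]

def build_histogram(positions, total, buckets):
    """Build an equi-width histogram over the position range [0, total)."""
    width = total / buckets
    counts = [0] * buckets
    for p in positions:
        # Clamp to the last bucket to guard against rounding at the upper edge.
        counts[min(int(p / width), buckets - 1)] += 1
    return counts, width

def estimate_range(counts, width, lo, hi):
    """Estimate how many summarized nodes lie in [lo, hi).

    Assumes nodes are spread uniformly inside each bucket, so a bucket
    contributes its count scaled by the fraction of it that overlaps [lo, hi).
    """
    estimate = 0.0
    for i, count in enumerate(counts):
        b_lo, b_hi = i * width, (i + 1) * width
        overlap = max(0.0, min(hi, b_hi) - max(lo, b_lo))
        estimate += count * overlap / width
    return estimate

root = ET.fromstring(DOC)
total = sum(1 for _ in root.iter())           # number of elements in the document
authors = document_positions(root, "author")  # positions of all <author> elements
counts, width = build_histogram(authors, total, buckets=2)

# The second <book> subtree occupies positions [4, 8); the estimate is
# roughly 1.4, while the true number of <author> nodes there is 2.
print(estimate_range(counts, width, 4, 8))
```

With only two buckets the estimate is coarse. Increasing the bucket count trades summary size for accuracy, which is the usual tension in histogram-based selectivity estimation.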

Cited By

  • (2008) EXsum. Proceedings of the 2008 international symposium on Database engineering & applications, pp. 139-148. DOI: 10.1145/1451940.1451961. Online publication date: 10-Sep-2008

Published In

SAC '08: Proceedings of the 2008 ACM symposium on Applied computing
March 2008
2586 pages
ISBN: 9781595937537
DOI: 10.1145/1363686

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. XML documents
  2. XML query processing
  3. hierarchical summarization
  4. histograms
  5. selectivity estimation

Qualifiers

  • Research-article

Conference

SAC '08: The 2008 ACM Symposium on Applied Computing
March 16 - 20, 2008
Fortaleza, Ceara, Brazil

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
