DOI: 10.1145/1871940.1871953

Repairing OLAP queries in databases with referential integrity errors

Published: 30 October 2010

Abstract

Many database applications and OLAP tools dynamically generate SQL queries that involve join operators and aggregate functions, and send these queries to a database server for execution. The generated SQL code normally assumes that the underlying tables and columns are clean, so it lacks the robustness needed to handle foreign keys with null, invalid, or undefined values, which are ubiquitous in databases with inconsistent or incomplete content. As a result, several issues arise at query time, mostly as inconsistencies in answer sets that users of OLAP tools find difficult to detect and explain. In this article, we present an automated query rewriting method for automatically generated OLAP queries that are executed over tables whose foreign key columns may contain null or invalid values. Our method applies to queries that use join operators and aggregate functions obeying the summarizability property (e.g. sum(), count()). When the user of an OLAP tool requests it, our method rewrites the queries that use join operators, warns the user of the referential integrity condition of the underlying database, and, when aggregate functions are involved, presents alternative consistent results in the answer sets. A preliminary experimental evaluation shows that the rewritten queries provide valuable information on referential integrity and take almost the same time as the original queries, which indicates that the overhead is minimal.
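The inconsistency described in the abstract can be illustrated with a small, self-contained sketch. This is not the paper's rewriting algorithm; it is a hypothetical example that assumes a toy schema (`sales`, `store`), made-up rows, and SQLite as the engine. It shows how an inner join silently drops rows whose foreign key is null or invalid, so the grouped sums no longer add up to the fact-table total, and how one possible rewrite (a left outer join with explicit labels) keeps those rows visible.

```python
import sqlite3

# Hypothetical schema and data, chosen only for illustration.
def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE sales (sale_id INTEGER PRIMARY KEY,
                            store_id INTEGER,   -- foreign key, not enforced
                            amount REAL);
        INSERT INTO store VALUES (1, 'Austin'), (2, 'Boston');
        INSERT INTO sales VALUES
            (1, 1,    100.0),
            (2, 2,     50.0),
            (3, NULL,  30.0),   -- null foreign key
            (4, 99,    20.0);   -- invalid foreign key (no store 99)
    """)
    return conn

# The kind of query an OLAP tool might generate: it assumes clean keys.
ORIGINAL_QUERY = """
    SELECT s.city, SUM(f.amount)
    FROM sales f JOIN store s ON f.store_id = s.store_id
    GROUP BY s.city
    ORDER BY s.city
"""

# One possible rewrite (an assumption, not the authors' method): keep every
# fact row and report null and invalid references as separate groups.
REWRITTEN_QUERY = """
    SELECT CASE
             WHEN f.store_id IS NULL THEN '(null reference)'
             WHEN s.store_id IS NULL THEN '(invalid reference)'
             ELSE s.city
           END AS city_label,
           SUM(f.amount) AS total
    FROM sales f LEFT JOIN store s ON f.store_id = s.store_id
    GROUP BY city_label
    ORDER BY city_label
"""

def run(conn, sql):
    return conn.execute(sql).fetchall()

if __name__ == "__main__":
    conn = build_demo_db()
    print("original :", run(conn, ORIGINAL_QUERY))
    print("rewritten:", run(conn, REWRITTEN_QUERY))
    print("fact total:", run(conn, "SELECT SUM(amount) FROM sales"))
```

In this sketch the original query accounts for 150.0 of the 200.0 stored in `sales`, while the rewritten query accounts for the full 200.0 and makes the two problematic groups explicit. That is possible because sum() is summarizable: the partial sums over disjoint groups add up to the overall total.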

Published In

DOLAP '10: Proceedings of the ACM 13th international workshop on Data warehousing and OLAP
October 2010
112 pages
ISBN: 9781450303835
DOI: 10.1145/1871940

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dbms
  2. sql

Qualifiers

  • Research-article

Conference

CIKM '10

Acceptance Rates

Overall Acceptance Rate 29 of 79 submissions, 37%
