DOI: 10.1145/1012453.1012460
Article

Utility-based resolution of data inconsistencies

Published: 18 June 2004

Abstract

A virtual database system is software that provides unified access to multiple information sources. If the sources overlap in their contents and are independently maintained, then the likelihood of inconsistent answers is high. Solutions are often based on ranking (which sorts the different answers according to recurrence) and on fusion (which synthesizes a new value from the different alternatives according to a specific formula). In this paper we argue that both methods are flawed, and we offer alternative solutions that are based on knowledge about the performance of the source data, including features such as recentness, availability, accuracy and cost. These features are combined in a flexible utility function that expresses the overall value of a data item to the user. Utility allows us to (1) define a meaningful ranking on the inconsistent set of answers, and offer the top-ranked answer as a preferred answer; (2) determine whether a fusion value is indeed better than the initial values, by calculating its utility and comparing it to the utilities of the initial values; and (3) discover the best fusion: the fusion formula that optimizes the utility. The advantages of such performance-based and utility-driven ranking and fusion are considerable.
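
The three uses of utility listed in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the form of the utility function or say how the features of a fused value are derived, so everything concrete here is an assumption made for illustration: the weighted-sum utility, the weights, accuracy modeled as an error variance, the rules in `fuse` for combining features, and all names (`Answer`, `utility`, `rank`, `fuse`, `best_fusion`, `inverse_variance_coeffs`). It is not the paper's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    value: float
    recentness: float    # in [0, 1], 1 = most recent (assumed scale)
    availability: float  # in [0, 1] (assumed scale)
    variance: float      # assumed error variance; lower = more accurate
    cost: float          # in [0, 1], higher = more expensive (assumed scale)


# Hypothetical user preferences; the weights sum to 1.
WEIGHTS = {"recentness": 0.2, "availability": 0.1, "accuracy": 0.6, "cost": 0.1}


def utility(a, w=WEIGHTS):
    """Assumed utility: weighted sum of feature scores, each in [0, 1]."""
    accuracy = 1.0 / (1.0 + a.variance)
    return (w["recentness"] * a.recentness
            + w["availability"] * a.availability
            + w["accuracy"] * accuracy
            + w["cost"] * (1.0 - a.cost))


def rank(answers):
    """(1) Rank inconsistent answers by utility, best first."""
    return sorted(answers, key=utility, reverse=True)


def fuse(answers, coeffs):
    """Linear fusion sum(c_i * value_i); coeffs are assumed to sum to 1.

    Feature rules are assumptions: the fused value is as stale and as
    available as its weakest input, costs add up, and errors are independent.
    """
    used = [(a, c) for a, c in zip(answers, coeffs) if c > 0]
    return Answer(
        value=sum(c * a.value for a, c in used),
        recentness=min(a.recentness for a, _ in used),
        availability=min(a.availability for a, _ in used),
        variance=sum(c * c * a.variance for a, c in used),
        cost=min(1.0, sum(a.cost for a, _ in used)),
    )


def inverse_variance_coeffs(answers):
    """Coefficients proportional to 1/variance, normalized to sum to 1."""
    inv = [1.0 / a.variance for a in answers]
    total = sum(inv)
    return [x / total for x in inv]


def best_fusion(answers, formulas):
    """(3) Among candidate fusion formulas, pick the one with highest utility."""
    fused = {name: fuse(answers, coeffs) for name, coeffs in formulas.items()}
    name = max(fused, key=lambda n: utility(fused[n]))
    return name, fused[name]


# Made-up example data: two sources disagree on one numeric value.
a = Answer(value=100.0, recentness=0.9, availability=0.9, variance=1.0, cost=0.1)
b = Answer(value=104.0, recentness=0.8, availability=0.9, variance=4.0, cost=0.1)

preferred = rank([a, b])[0]
formulas = {"mean": [0.5, 0.5], "inverse_variance": inverse_variance_coeffs([a, b])}
name, fused = best_fusion([a, b], formulas)

# (2) A fusion value is only worth offering if it beats the best initial value.
print("preferred initial value:", preferred.value)
print("best fusion formula:", name)
print("fusion beats preferred:", utility(fused) > utility(preferred))
```

With these made-up numbers, the plain mean scores below the best single answer, while the inverse-variance fusion scores slightly above it. That is the kind of comparison point (2) of the abstract describes: a fused value is judged by its own utility rather than assumed to be an improvement.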



Published In

IQIS '04: Proceedings of the 2004 international workshop on Information quality in information systems
June 2004
81 pages
ISBN:1581139020
DOI:10.1145/1012453
Publisher

Association for Computing Machinery

New York, NY, United States

