Synonyms
Conflict resolution; Truth discovery
Definition
Consider a set of data sources \(\mathcal{S}\) and a set of data items \(\mathcal{D}\). A data item represents a particular aspect of a real-world entity, such as the authors of a book or the headquarters of a company; in a relational database, a data item corresponds to a cell in a table. For each data item \(D \in \mathcal{D}\), each source \(S \in \mathcal{S}\) may, but need not, provide a value; the value can be atomic (e.g., a scheduled departure time), a set (e.g., a set of phone numbers), or a list (e.g., a list of book authors). Different sources may provide different values for the same data item. A value is considered true in the following three cases: if there is a single atomic value that captures the real world, that value is considered true; if there are multiple atomic values that are consistent with the real world, the set of all such values is considered true; if there is an ordering of the atomic values that...
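To make the setting concrete, the following minimal Python sketch models sources, data items, and the three kinds of values (atomic, set, list) described above. The sample observations, the item identifiers, and the function names `group_by_item` and `majority_vote` are all hypothetical. Majority voting is used here only as a naive baseline for picking one value per item; it is an assumption of this sketch, not a method prescribed by the definition.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (source, data item, value).
# Atomic values are strings, sets are frozensets, lists are tuples,
# so that every value is hashable and can be counted.
claims = [
    ("S1", "flight_49.departure", "18:05"),
    ("S2", "flight_49.departure", "18:05"),
    ("S3", "flight_49.departure", "18:15"),
    ("S1", "book_1.authors", ("A. Author", "B. Author")),
    ("S3", "book_1.authors", ("B. Author", "A. Author")),
    ("S2", "acme.phones", frozenset({"555-0100", "555-0101"})),
]


def group_by_item(claims):
    """Map each data item to {source: value}; a source may omit an item."""
    by_item = defaultdict(dict)
    for source, item, value in claims:
        by_item[item][source] = value
    return dict(by_item)


def majority_vote(values_by_source):
    """Return the value provided by the most sources, or None on a tie."""
    counts = Counter(values_by_source.values())
    top = max(counts.values())
    winners = [value for value, count in counts.items() if count == top]
    return winners[0] if len(winners) == 1 else None


by_item = group_by_item(claims)
fused = {item: majority_vote(values) for item, values in by_item.items()}
```

In this sketch the two differently ordered author lists count as conflicting values, so the vote for that item is a tie and no value is chosen. Voting also treats every source as equally reliable and independent, which is a simplification.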
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Dong, X.L., Srivastava, D. (2018). Data Fusion. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_2354
DOI: https://doi.org/10.1007/978-1-4614-8265-9_2354
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering