Abstract
In many settings, such as the Internet of Things and data fusion, multiple sources provide data about the same object, but the quality of the data differs from source to source. Consequently, even when some of the queried sources may provide low-quality data, the query results should consist of high-quality data. In this paper, we define the quality-aware query and build a model of the quality-aware query scenario, whose aim is to obtain high-quality results from multiple sources that may have different data quality scores. An uncertain graph is used to model the relative quality of the sources, and a method for computing the quality of the query results is provided.
The first two authors contributed equally to this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (No. 61572153, No. 61702220, No. 61702223).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, M., Sun, Y., Wang, L., Lu, H. (2018). Quality-Aware Query Based on Relative Source Quality. In: Sun, X., Pan, Z., Bertino, E. (eds) Cloud Computing and Security. ICCCS 2018. Lecture Notes in Computer Science, vol 11064. Springer, Cham. https://doi.org/10.1007/978-3-030-00009-7_1
DOI: https://doi.org/10.1007/978-3-030-00009-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00008-0
Online ISBN: 978-3-030-00009-7
eBook Packages: Computer Science, Computer Science (R0)