Quality-Aware Query Based on Relative Source Quality

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11064)

Abstract

In many settings, such as the Internet of Things or data fusion, more than one source provides data about the same object, but the quality of the data differs across sources. Therefore, even when some of the queried sources may provide low-quality data, the query results should consist of high-quality data. In this paper, we define the quality-aware query and build a model of the quality-aware query scenario, which aims to obtain high-quality results from multiple sources that may have different data quality scores. An uncertain graph is used to model the relative source quality, and a method for computing the quality of the query results is provided.

The first two authors contributed equally to this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61572153, No. 61702220, No. 61702223).

Author information

Corresponding author

Correspondence to Le Wang.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, M., Sun, Y., Wang, L., Lu, H. (2018). Quality-Aware Query Based on Relative Source Quality. In: Sun, X., Pan, Z., Bertino, E. (eds) Cloud Computing and Security. ICCCS 2018. Lecture Notes in Computer Science, vol 11064. Springer, Cham. https://doi.org/10.1007/978-3-030-00009-7_1

  • DOI: https://doi.org/10.1007/978-3-030-00009-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00008-0

  • Online ISBN: 978-3-030-00009-7

  • eBook Packages: Computer Science, Computer Science (R0)
