Faster Algorithm for Truth Discovery via Range Cover

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10389)

Abstract

Truth discovery is a key problem in data analytics that has received a great deal of attention in recent years. In this problem, we seek to obtain trustworthy information from data aggregated from multiple, possibly unreliable, sources. Most existing approaches to this problem are heuristic in nature and do not provide any quality guarantee. Very recently, the first quality-guaranteed algorithm was discovered. However, the running time of that algorithm depends on the spread ratio of the input points and is fully polynomial only when the spread ratio is relatively small, which could severely restrict its applicability. To resolve this issue, we propose in this paper a new algorithm which, for any dataset, yields a \((1+\epsilon)\)-approximation in near-quadratic time with constant probability. Our algorithm relies on a data structure called range cover, which is interesting in its own right. The data structure provides a general approach for solving some high-dimensional optimization problems by breaking them down into a small number of parametrized cases.
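The abstract does not define the spread ratio. As an illustration only, the sketch below assumes the usual geometric definition: the largest pairwise distance among the input points divided by the smallest pairwise distance. The function name `spread_ratio` and the sample points are hypothetical and are not taken from the paper. The sketch suggests why a running time that depends on this quantity can be restrictive: a single pair of nearly coincident points makes the ratio very large.

```python
from itertools import combinations
from math import dist


def spread_ratio(points):
    """Largest pairwise distance divided by smallest pairwise distance.

    Assumed definition for illustration; the abstract does not state one.
    """
    distances = [dist(p, q) for p, q in combinations(points, 2)]
    if not distances:
        raise ValueError("need at least two points")
    smallest = min(distances)
    if smallest == 0:
        raise ValueError("duplicate points make the spread ratio undefined")
    return max(distances) / smallest


# Corners of the unit square: the ratio is the diagonal over the side.
well_spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Adding one point very close to an existing corner inflates the ratio.
near_duplicate = well_spread + [(1.0, 1.0 + 1e-6)]

print(spread_ratio(well_spread))
print(spread_ratio(near_duplicate))
```

The brute-force computation over all pairs is quadratic in the number of points, which is fine for a small illustration but is not meant as part of any algorithm described in the paper.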

The research of the first and third authors was supported in part by NSF through grants CCF-1422324, IIS-1422591, and CNS-1547167. The research of the second author was supported by a start-up fund from Michigan State University.

Author information

Corresponding author

Correspondence to Ziyun Huang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Huang, Z., Ding, H., Xu, J. (2017). Faster Algorithm for Truth Discovery via Range Cover. In: Ellen, F., Kolokolova, A., Sack, JR. (eds) Algorithms and Data Structures. WADS 2017. Lecture Notes in Computer Science, vol 10389. Springer, Cham. https://doi.org/10.1007/978-3-319-62127-2_39

  • DOI: https://doi.org/10.1007/978-3-319-62127-2_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62126-5

  • Online ISBN: 978-3-319-62127-2

  • eBook Packages: Computer Science, Computer Science (R0)
