DOI: 10.1145/2939672.2939837
research-article
Public Access

From Truth Discovery to Trustworthy Opinion Discovery: An Uncertainty-Aware Quantitative Modeling Approach

Published: 13 August 2016

Abstract

In this era of information explosion, conflicts often arise when information is provided by multiple sources. The traditional truth discovery task aims to identify the truth, i.e., the most trustworthy information, from conflicting sources in different scenarios. In this kind of task, truth is regarded as a fixed value or a set of fixed values. However, in a number of real-world cases, the existence of an objective truth cannot be ensured, and we can only identify single or multiple reliable facts from opinions. Unlike the traditional truth discovery task, we address this uncertainty by introducing the concept of the trustworthy opinion of an entity, treating it as a random variable, and using its distribution to describe consistency or controversy. This is particularly difficult for data that can be numerically measured, i.e., quantitative information. In this study, we focus on quantitative opinions, propose an uncertainty-aware approach called Kernel Density Estimation from Multiple Sources (KDEm) to estimate the probability distribution of the trustworthy opinion, and summarize trustworthy information based on this distribution. Experiments indicate that KDEm not only achieves outstanding performance on the classical numeric truth discovery task, but also shows good performance on multi-modality detection and anomaly detection in the uncertain-opinion setting.
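
The idea of estimating an opinion distribution from several sources can be illustrated with a minimal sketch. It is not the KDEm algorithm as published: the Gaussian kernel, the fixed bandwidth, the grid-based distance, the logarithmic weight update, and the function names `gaussian_kde_on_grid`, `weighted_opinion_density`, and `local_modes` are all assumptions made for illustration. Each source gets its own kernel density estimate, the estimates are combined with source weights, and sources whose density lies far from the combined one are down-weighted.

```python
import numpy as np


def gaussian_kde_on_grid(values, grid, bandwidth):
    """Average of Gaussian kernels centered at `values`, evaluated on `grid`."""
    values = np.asarray(values, dtype=float)
    z = (grid[:, None] - values[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)


def weighted_opinion_density(claims, grid, bandwidth=1.0, n_iter=20):
    """Combine per-source densities for one entity.

    claims: dict mapping a source name to its numeric claims.
    Returns the combined density on `grid` and a dict of source weights.
    """
    sources = list(claims)
    if len(sources) < 2:
        raise ValueError("need at least two sources to compare")
    densities = np.stack(
        [gaussian_kde_on_grid(claims[s], grid, bandwidth) for s in sources]
    )
    weights = np.full(len(sources), 1.0 / len(sources))
    step = grid[1] - grid[0]
    eps = 1e-12  # guards against log(0) when a source matches the consensus
    for _ in range(n_iter):
        combined = weights @ densities
        # Squared L2 distance of each source density from the combined one.
        dist = ((densities - combined) ** 2).sum(axis=1) * step
        # Assumed update rule: a smaller share of the total distance
        # yields a larger score, hence a larger weight.
        scores = -np.log((dist + eps) / (dist.sum() + len(sources) * eps))
        weights = scores / scores.sum()
    combined = weights @ densities
    return combined, dict(zip(sources, weights))


def local_modes(density, grid):
    """Grid points where the density has a local maximum."""
    inner = density[1:-1]
    is_peak = (inner > density[:-2]) & (inner >= density[2:])
    return grid[1:-1][is_peak]


# Hypothetical data: three sources roughly agree, one disagrees strongly.
grid = np.linspace(0.0, 40.0, 801)
claims = {"a": [10.0], "b": [10.5], "c": [9.8], "d": [30.0]}
density, weights = weighted_opinion_density(claims, grid, bandwidth=1.0)
print(weights)
print(local_modes(density, grid))
```

In this toy setting the disagreeing source receives the smallest weight, and the modes of the combined density can be read as candidate opinions. A single dominant mode suggests consistency, while several modes of comparable height suggest controversy.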

      Published In

      KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
      August 2016
      2176 pages
      ISBN:9781450342322
      DOI:10.1145/2939672
      © 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. kernel density estimation
      2. source reliability
      3. truth discovery

      Qualifiers

      • Research-article

      Acceptance Rates

      KDD '16 Paper Acceptance Rate 66 of 1,115 submissions, 6%;
      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

      Article Metrics

      • Downloads (last 12 months): 130
      • Downloads (last 6 weeks): 18
      Reflects downloads up to 02 Mar 2025

      Cited By

      • (2022) Crowd Bus Sensing: Resolving Conflicts Between the Ground Truth and Map Apps. IEEE Transactions on Mobile Computing, pages 1-15. DOI: 10.1109/TMC.2022.3231085. Online publication date: 2022
      • (2022) An Unsupervised Bayesian Neural Network for Truth Discovery in Social Networks. IEEE Transactions on Knowledge and Data Engineering, 34:11, pages 5182-5195. DOI: 10.1109/TKDE.2021.3054853. Online publication date: 1-Nov-2022
      • (2021) Data Poisoning Attacks and Defenses to Crowdsourcing Systems. Proceedings of the Web Conference 2021, pages 969-980. DOI: 10.1145/3442381.3450066. Online publication date: 19-Apr-2021
      • (2021) Crowdsourcing System for Numerical Tasks based on Latent Topic Aware Worker Reliability. IEEE INFOCOM 2021 - IEEE Conference on Computer Communications, pages 1-10. DOI: 10.1109/INFOCOM42981.2021.9488748. Online publication date: 10-May-2021
      • (2020) From Appearance to Essence. ACM Transactions on Intelligent Systems and Technology, 11:6, pages 1-24. DOI: 10.1145/3411749. Online publication date: 11-Sep-2020
      • (2020) A Reliability-Aware Vehicular Crowdsensing System for Pothole Profiling. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 3:4, pages 1-26. DOI: 10.1145/3369815. Online publication date: 14-Sep-2020
      • (2020) Estimating Consensus from Crowdsourced Continuous Annotations. 2020 3rd International Conference on Communication System, Computing and IT Applications (CSCITA), pages 156-161. DOI: 10.1109/CSCITA47329.2020.9137784. Online publication date: Apr-2020
      • (2020) Claim verification under positive unlabeled learning. Proceedings of the 12th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pages 143-150. DOI: 10.1109/ASONAM49781.2020.9381336. Online publication date: 7-Dec-2020
      • (2020) CrowdQM: Learning Aspect-Level User Reliability and Comment Trustworthiness in Discussion Forums. Advances in Knowledge Discovery and Data Mining, pages 592-605. DOI: 10.1007/978-3-030-47426-3_46. Online publication date: 6-May-2020
      • (2019) Privacy-aware synthesizing for crowdsourced data. Proceedings of the 28th International Joint Conference on Artificial Intelligence, pages 2542-2548. DOI: 10.5555/3367243.3367393. Online publication date: 10-Aug-2019
