Abstract
The rapid development of the Internet makes it easier to spread and obtain data. However, conflicting descriptions of the same object from different sources make it challenging to identify trustworthy information. This is known as the truth discovery task. In truth discovery, an object may have multiple true values, such as a book written by multiple authors. Existing multi-truth discovery methods primarily estimate the probability that each candidate value is correct and report only a point estimate. In practice, however, objects are unevenly distributed across sources, and a single point estimate may overlook critical confidence information. Additionally, ambiguous terms like “etc.” and “et al.” can bias the estimates. To address these issues, we propose MTD_VCI, a confidence-aware optimization model for discovering multiple truths from data with an unbalanced distribution. MTD_VCI estimates the credibility score of each candidate value and attaches a confidence interval that reflects the uneven distribution, which supports better decision-making. Additionally, the number of values claimed by ambiguous sources is re-estimated using other sources as a reference. Experimental results on real-world and simulated datasets demonstrate that MTD_VCI produces better results and effective confidence intervals for each value.
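To illustrate why a point estimate alone can mislead on unbalanced data, the sketch below compares two candidate values with very different numbers of supporting sources. It is not the MTD_VCI model: it substitutes a standard Wilson score interval on the fraction of supporting sources for the paper's credibility score, and the function name `wilson_interval` and the source counts are hypothetical, chosen only for illustration.

```python
import math
from typing import Tuple


def wilson_interval(supporting: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the fraction of sources supporting a value.

    Assumption: this is a generic stand-in for a credibility interval,
    not the estimator defined by MTD_VCI.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= supporting <= total:
        raise ValueError("supporting must lie between 0 and total")
    p = supporting / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: a rarely covered object versus a widely covered one.
sparse = wilson_interval(supporting=2, total=2)      # point estimate 1.00
dense = wilson_interval(supporting=180, total=200)   # point estimate 0.90

print("sparse object:", sparse)
print("dense object: ", dense)
```

The sparse value has the higher point estimate, yet its interval is much wider and its lower bound much lower than that of the densely covered value, which is the kind of confidence information a point estimate hides.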
Acknowledgements
This work was supported by the Fundamental Research Funds for the Central Universities (No. 23D111204, 22D111210), the Shanghai Science and Technology Commission (No. 22YF1401100), and the National Science Fund for Young Scholars (No. 62202095).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Fang, X., Shen, C., Sheng, Q.Z., Sun, G., Tang, Y., Zhuo, H. (2023). A Multi-truth Discovery Approach Based on Confidence Interval Estimation of Truths. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol 14180. Springer, Cham. https://doi.org/10.1007/978-3-031-46677-9_41
DOI: https://doi.org/10.1007/978-3-031-46677-9_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46676-2
Online ISBN: 978-3-031-46677-9
eBook Packages: Computer Science, Computer Science (R0)