DOI: 10.1145/3308558.3313712
research-article

A Semi-Supervised Active-learning Truth Estimator for Social Networks

Published: 13 May 2019

ABSTRACT

This paper introduces an active-learning-based truth estimator for social networks, such as Twitter, that significantly enhances estimation accuracy by requesting that a small, well-selected fraction of the data be labeled. Data assessment and truth discovery from arbitrary open online sources are hard problems because source reliability is uncertain. Multiple truth-finding systems have been developed to solve this problem, but their accuracy is limited by the noisy nature of the data, which contains distortions, fabrications, omissions, and duplication. This paper presents a semi-supervised truth estimator for social networks in which a portion of the inputs is carefully selected to be reliably verified. The challenge is to find the subset of observations whose verification would maximally enhance the overall fact-finding accuracy. This work extends previous passive approaches to recursive truth estimation, as well as semi-supervised approaches in which the estimator has no control over the choice of data to be labeled. Results show that by optimally selecting claims to be verified, we improve estimation accuracy by 12% over the unsupervised baseline and by 5% over previous semi-supervised approaches.
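The abstract does not state how claims are chosen for verification, so the following is only a minimal sketch of the general idea, not the paper's estimator. It assumes a toy iterative trust/belief averaging scheme and plain uncertainty sampling (pick the unverified claim whose belief is closest to 0.5). The function names `estimate` and `select_claim`, the data layout, and the 0.5 prior are all hypothetical.

```python
from typing import Dict, Optional, Set


def estimate(assertions: Dict[str, Set[str]],
             labels: Dict[str, bool],
             iters: int = 10) -> Dict[str, float]:
    """Toy semi-supervised truth estimator (illustrative assumption only).

    assertions maps each source to the set of claims it makes.
    labels maps verified claims to their known truth value.
    Returns a belief in [0, 1] for every claim.
    """
    claims = sorted({c for cs in assertions.values() for c in cs})
    trust = {s: 0.5 for s in assertions}  # assumed neutral prior
    belief: Dict[str, float] = {}
    for _ in range(iters):
        for c in claims:
            if c in labels:
                # Verified claims are clamped to their known value.
                belief[c] = 1.0 if labels[c] else 0.0
            else:
                # Unverified claims get the mean trust of their sources.
                backers = [trust[s] for s, cs in assertions.items() if c in cs]
                belief[c] = sum(backers) / len(backers)
        for s, cs in assertions.items():
            # A source is trusted as much as its claims are believed.
            trust[s] = sum(belief[c] for c in cs) / len(cs) if cs else 0.5
    return belief


def select_claim(belief: Dict[str, float],
                 labels: Dict[str, bool]) -> Optional[str]:
    """Uncertainty sampling: the unverified claim closest to belief 0.5."""
    candidates = [c for c in belief if c not in labels]
    if not candidates:
        return None
    # Ties are broken by claim name so the choice is deterministic.
    return min(candidates, key=lambda c: (abs(belief[c] - 0.5), c))
```

In an active-learning loop under these assumptions, one would call `estimate`, pass the result to `select_claim`, have the chosen claim verified, add the answer to `labels`, and repeat until the labeling budget is spent. A passive semi-supervised variant would use the same `estimate` step but receive its labels without choosing them.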

  • Published in

    WWW '19: The World Wide Web Conference
    May 2019
    3620 pages
    ISBN: 9781450366748
    DOI: 10.1145/3308558

    Copyright © 2019 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
