Efficient Privacy-Preserving Truth Discovery and Copy Detection in Crowdsourcing

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD 2024)

Abstract

As privacy protection grows in importance, privacy-preserving truth discovery continues to attract research attention and has produced a number of results. However, existing privacy-preserving truth discovery methods overlook the common presence of copiers among workers, which reduces accuracy. Detecting copiers on privacy-preserving data is challenging because methods based on encryption or perturbation can easily introduce noise and destroy the correlation between the original data. To address this challenge, we propose CAPP-TD, an anti-copy iterative model based on lightweight homomorphic encryption. First, we propose a lightweight privacy protection mechanism based on Paillier homomorphic encryption that preserves the correlation of the private data. Compared with traditional homomorphic encryption-based algorithms, it incurs less communication and computation overhead when performing truth discovery with copy detection. We then propose an iterative truth discovery method that efficiently detects copy relationships in encrypted data and excludes copiers from truth inference to improve accuracy. Experimental results on both real-world and synthetic datasets, together with a thorough security analysis, demonstrate that CAPP-TD protects crowdsourcing systems from adversaries and enables highly accurate truth discovery.
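To illustrate the additive homomorphism that the abstract relies on, the following is a minimal sketch of textbook Paillier encryption. It is not the CAPP-TD mechanism itself, whose lightweight construction is not described in this preview. The function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `scale_encrypted`), the tiny primes, and the example answers and weights are all assumptions chosen for readability; a real deployment would use a vetted library and primes of at least 1024 bits each. The sketch assumes Python 3.9 or later.

```python
import math
import secrets


def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1 (hypothetical helper)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)   # Carmichael function of n
    mu = pow(lam, -1, n)           # valid because L(g^lam mod n^2) = lam mod n when g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover m from ciphertext c using the private key (lam, mu)."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def scale_encrypted(n, c, k):
    """Raising a ciphertext to the power k multiplies the plaintext by k."""
    return pow(c, k, n * n)


if __name__ == "__main__":
    # Small primes for illustration only; they offer no security.
    n, priv = keygen(1009, 1013)

    answers = [3, 5, 4]   # hypothetical worker answers for one task
    weights = [2, 1, 1]   # hypothetical integer worker weights

    # A server can form the encrypted weighted sum without seeing any answer.
    acc = encrypt(n, 0)
    for a, w in zip(answers, weights):
        acc = add_encrypted(n, acc, scale_encrypted(n, encrypt(n, a), w))

    weighted_sum = decrypt(n, priv, acc)
    assert weighted_sum == sum(a * w for a, w in zip(answers, weights))
    print(weighted_sum / sum(weights))   # weighted-average estimate of the truth
```

Weighted aggregation of this kind is the core arithmetic step in many truth discovery schemes, which is one reason additively homomorphic encryption is a natural fit: the aggregator only ever handles ciphertexts, and only the key holder learns the aggregate.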

Notes

  1. Due to economic and ethical reasons, we were unable to find datasets with private content, so we used these datasets for simulation.

Acknowledgments

This work was supported by Shanghai Science and Technology Commission (No. 22YF1401100), Fundamental Research Funds for the Central Universities (No. 22D111210), and National Science Fund for Young Scholars (No. 62202095, No. 62102058).

Author information

Corresponding author

Correspondence to Guohao Sun.

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Fang, X.S., Du, X., Chen, H., Wei, Z., Zhan, Y., Sun, G. (2024). Efficient Privacy-Preserving Truth Discovery and Copy Detection in Crowdsourcing. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14943. Springer, Cham. https://doi.org/10.1007/978-3-031-70352-2_22

  • DOI: https://doi.org/10.1007/978-3-031-70352-2_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70351-5

  • Online ISBN: 978-3-031-70352-2

  • eBook Packages: Computer Science, Computer Science (R0)
