Outsourcing collaboration analysis of multiparty privacy data using the improved Yannakakis

The Journal of Supercomputing

Abstract

Many organizations produce or collect private data in fields such as medical research and government regulation. Because of privacy-protection laws and regulations, commercial competition, and other constraints, these institutions cannot share their data directly. Collaborative analysis of private data held by multiple institutions would nevertheless benefit each institution and create shared profit. We therefore propose a Yannakakis-based multiparty outsourced collaborative analysis scheme. It enables organizations to jointly analyze private data from multiple organizations according to their needs while ensuring that private data are not leaked to one another. The scheme uses an improved Yannakakis algorithm to build a series of query components, such as Semi-join, Join, and Order-by. We also optimize the join operation: by obfuscating the input tuples and protecting their authenticity through annotations, the join can be performed directly on hash values without disclosing the join results. With these components, a query can be executed with \( O(\textrm{IN}+\textrm{OUT}) \) runtime and communication, where \( \textrm{IN} \) is the total number of tuples in the input relations and \( \textrm{OUT} \) is the output size. Comparative experiments show that our system is 1.3×–7.4× faster than the baseline.
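To make the \( O(\textrm{IN}+\textrm{OUT}) \) claim concrete, the following is a minimal plaintext sketch of the classical Yannakakis algorithm for a full acyclic join. It is not the scheme proposed in the article: it has no outsourcing, obfuscation, or annotations, and every name in it (`semi_join`, `hash_join`, `yannakakis`, the relations `R`, `S`, `T`) is a hypothetical choice made for illustration. It assumes the join tree is supplied as edges listed leaves first.

```python
from collections import defaultdict


def key_of(row, attrs):
    """Project a row (a dict) onto the join attributes as a hashable key."""
    return tuple(row[a] for a in attrs)


def semi_join(r, s, attrs):
    """Keep the rows of r that match at least one row of s on attrs."""
    keys = {key_of(row, attrs) for row in s}
    return [row for row in r if key_of(row, attrs) in keys]


def hash_join(r, s, attrs):
    """Natural join of r and s on attrs using a hash index built over s."""
    index = defaultdict(list)
    for row in s:
        index[key_of(row, attrs)].append(row)
    return [{**t, **u} for t in r for u in index.get(key_of(t, attrs), ())]


def yannakakis(relations, edges):
    """Plaintext Yannakakis for a full acyclic join (illustrative only).

    relations: name -> list of dict rows.
    edges: (child, parent, join_attrs) triples of a join tree, leaves first,
           so the parent of the last edge is the root.
    """
    rel = {name: list(rows) for name, rows in relations.items()}
    # Pass 1 (bottom-up): each parent drops rows unmatched by its child.
    for child, parent, attrs in edges:
        rel[parent] = semi_join(rel[parent], rel[child], attrs)
    # Pass 2 (top-down): each child drops rows unmatched by its parent.
    for child, parent, attrs in reversed(edges):
        rel[child] = semi_join(rel[child], rel[parent], attrs)
    # Pass 3 (bottom-up): join children into parents. After the two
    # semi-join passes no dangling rows remain, so every intermediate
    # row contributes to the final output.
    for child, parent, attrs in edges:
        rel[parent] = hash_join(rel[parent], rel[child], attrs)
    return rel[edges[-1][1]]


# Hypothetical example: R(a, b) joined with S(b, c) joined with T(c, d), S as root.
relations = {
    "R": [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}],
    "S": [{"b": 10, "c": 100}, {"b": 20, "c": 200}, {"b": 40, "c": 400}],
    "T": [{"c": 100, "d": "x"}, {"c": 100, "d": "y"}, {"c": 300, "d": "z"}],
}
edges = [("R", "S", ["b"]), ("T", "S", ["c"])]
result = yannakakis(relations, edges)
for row in result:
    print(row)
```

The two semi-join passes touch each input tuple a constant number of times per tree edge, and the final join pass only produces rows that survive to the output, which is where the linear dependence on input plus output size comes from in the plaintext setting. A secure variant would additionally have to hide intermediate sizes and access patterns.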

Data Availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This study was conducted by academic staff of the School of Cyber Security and Information Law, Chongqing University of Posts and Telecommunications, China. The authors thank the institute for the resources and assistance it provided during the research, and the reviewers for their helpful suggestions.

Funding

This work is supported by the Opening Project of the Intelligent Policing Key Laboratory of Sichuan Province, No. ZNJW2022KFZD002.

Author information

Contributions

Zigang Chen was responsible for conceptualization, methodology, and software. Zhenjiang Zhang was responsible for formal analysis, data curation, writing (original draft), and writing (review and editing). Haihua Zhu and Tao Leng were responsible for visualization, investigation, and data curation. Yuhong Liu contributed to conceptualization and methodology. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Tao Leng or Haihua Zhu.

Ethics declarations

Declaration of conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, Z., Zhang, Z., Leng, T. et al. Outsourcing collaboration analysis of multiparty privacy data using the improved Yannakakis. J Supercomput 81, 497 (2025). https://doi.org/10.1007/s11227-025-06994-5

  • DOI: https://doi.org/10.1007/s11227-025-06994-5
