Abstract
Many organizations produce or collect private data in fields such as medical research and government regulation. Because of privacy laws and regulations, commercial competition, and other constraints, these institutions cannot share their data directly. Collaborative analysis of private data held by multiple institutions would nevertheless benefit each institution and generate shared profit. We therefore propose a Yannakakis-based multiparty outsourced collaborative analysis scheme. It enables organizations to jointly analyze private data from multiple organizations according to their needs while ensuring that the private data are not leaked to one another. Our scheme builds a series of query components, such as Semi-join, Join, and Order-by, on an improved Yannakakis algorithm. We also optimize the join operation: by obfuscating the input tuples and protecting their authenticity through annotations, the join can be performed directly on hash values without disclosing the join results. With these components, a query can be executed with \( O(\textrm{IN}+\textrm{OUT}) \) runtime and communication, where \( \textrm{IN} \) is the total number of tuples in the input relations and \( \textrm{OUT} \) is the output size. In a series of comparative experiments, our system is 1.3X–7.4X faster than the baseline.
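To make the \( O(\textrm{IN}+\textrm{OUT}) \) bound concrete, the following is a minimal plaintext sketch of the classical Yannakakis algorithm on a three-relation path query. It is not the secure scheme proposed in this paper: it contains no obfuscation, annotations, or outsourcing, and the relation layout (lists of dictionaries) and the function names `semi_join`, `hash_join`, and `yannakakis_path` are assumptions made purely for illustration.

```python
from collections import defaultdict


def semi_join(left, right, key):
    """Keep the tuples of `left` whose `key` value also appears in `right`."""
    keys = {row[key] for row in right}
    return [row for row in left if row[key] in keys]


def hash_join(left, right, key):
    """Join two relations on `key` using a hash index built over `right`."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def yannakakis_path(r, s, t):
    """Evaluate R(a,b) JOIN S(b,c) JOIN T(c,d) along the join tree R - S - T."""
    # Bottom-up pass: remove dangling tuples, moving from the leaf T to the root R.
    s = semi_join(s, t, "c")
    r = semi_join(r, s, "b")
    # Top-down pass: propagate the reduction back from the root to the leaves.
    s = semi_join(s, r, "b")
    t = semi_join(t, s, "c")
    # Join pass: every remaining tuple now contributes to at least one output row.
    return hash_join(hash_join(r, s, "b"), t, "c")


if __name__ == "__main__":
    R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
    S = [{"b": 10, "c": 100}, {"b": 20, "c": 200}, {"b": 40, "c": 400}]
    T = [{"c": 100, "d": "x"}, {"c": 100, "d": "y"}, {"c": 300, "d": "z"}]
    for row in yannakakis_path(R, S, T):
        print(row)
```

Each semi-join pass touches every input tuple a constant number of times, and after the two passes no dangling tuples remain, so for an acyclic full join the intermediate results do not exceed the final output. This is the source of the linear dependence on \( \textrm{IN} \) and \( \textrm{OUT} \) in the plaintext setting.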















Data Availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This study was conducted by academic staff of the School of Cyber Security and Information Law, Chongqing University of Posts and Telecommunications, China. The authors thank the institute for providing resources and assistance during the research process, and the reviewers for their helpful suggestions.
Funding
This work was supported by the Opening Project of Intelligent Policing Key Laboratory of Sichuan Province, No. ZNJW2022KFZD002.
Author information
Contributions
Zigang Chen was responsible for conceptualization, methodology, and software. Zhenjiang Zhang was responsible for formal analysis, data curation, writing (original draft), and writing (review and editing). Haihua Zhu and Tao Leng carried out visualization, investigation, and data curation. Yuhong Liu contributed to conceptualization and methodology. All authors read and approved the final manuscript.
Ethics declarations
Declaration of conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, Z., Zhang, Z., Leng, T. et al. Outsourcing collaboration analysis of multiparty privacy data using the improved Yannakakis. J Supercomput 81, 497 (2025). https://doi.org/10.1007/s11227-025-06994-5
DOI: https://doi.org/10.1007/s11227-025-06994-5