MLChain: a privacy-preserving model learning framework using blockchain

Bansal, Vidhi; Baliyan, Niyati; Ghosh, Mohona

doi:10.1007/s10207-023-00754-3

Regular Contribution
Published: 28 September 2023

International Journal of Information Security, Volume 23, pages 649–677 (2024)

Abstract

In this work, we present a blockchain-based, secure, and flexible distributed privacy-preserving online model that helps share key features of datasets across multiple organizations without violating data privacy. In our model, all members are encouraged to participate and discouraged from writing fake data. Learning is carried out without sharing raw data, and the immutable data sharing improves prediction results on the data held by each member of an industry. We also propose a new consensus algorithm, Proof of Share, for adding a valid transaction to the blockchain, thus preventing non-participating members from reading any of the data shared by peers and discouraging fake writes. We evaluated our model on 3-, 5-, and 10-member setups by applying decision tree, logistic regression, Gaussian naive Bayes, and support vector machine classifiers. The maximum increase in accuracy was \(26.9231\%\), with the results on a member’s own data taken as the baseline. The \(F_{\beta }(\beta =0.5)\) score increased by 0.4533 and the \(F_{1}\) score by 0.0800. To the best of our knowledge, the proposed model is the only one that encourages all members to participate rather than remain passive listeners and discourages a member from forging results, making it suitable for domains such as health care, finance, and education, where data are unevenly split and secrecy of data and peers is required.
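
As a reading aid for the metrics reported above, the following minimal sketch computes accuracy and the \(F_{\beta }\) score from confusion-matrix counts, using the standard definition \(F_{\beta } = (1+\beta ^{2})\,TP / ((1+\beta ^{2})\,TP + \beta ^{2}\,FN + FP)\). The function names and the example counts are hypothetical and are not taken from the article; the abstract does not state whether the reported accuracy increase is relative or in percentage points, so the sketch only reports plain differences.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions; 0.0 if there are no samples."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from confusion-matrix counts; 0.0 when undefined."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts for one member: local baseline vs. after sharing.
# These numbers are invented for illustration only.
baseline = {"tp": 40, "tn": 30, "fp": 10, "fn": 20}
shared = {"tp": 50, "tn": 32, "fp": 8, "fn": 10}


def summarize(c: dict) -> dict:
    """Collect the three metrics named in the abstract for one confusion matrix."""
    return {
        "accuracy": accuracy(c["tp"], c["tn"], c["fp"], c["fn"]),
        "f0.5": f_beta(c["tp"], c["fp"], c["fn"], beta=0.5),
        "f1": f_beta(c["tp"], c["fp"], c["fn"], beta=1.0),
    }


before, after = summarize(baseline), summarize(shared)
for name in before:
    # Plain difference between the two settings for each metric.
    print(f"{name}: {before[name]:.4f} -> {after[name]:.4f} "
          f"(difference {after[name] - before[name]:+.4f})")
```

With \(\beta =0.5\) the score weights precision more heavily than recall, while \(F_{1}\) weights them equally, which is why the two scores can change by different amounts for the same classifier.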

Data availability

The data are publicly available from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/EEG+Eye+State [37].

References

Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system. Technical Report Manubot (2019)
Zheng, Z., Xie, S., Dai, H.-N., Chen, X., Wang, H.: Blockchain challenges and opportunities: a survey. Int. J. Web Grid Serv. 14, 352–375 (2018)
Kuo, T.-T., Ohno-Machado, L.: Modelchain: decentralized privacy-preserving healthcare predictive modeling framework on private blockchain networks. arXiv:1802.01746 (2018)
Omar, I.A., Jayaraman, R., Salah, K., Yaqoob, I., Ellahham, S.: Applications of blockchain technology in clinical trials: review and open challenges. Arabian J. Sci. Eng. 46, 3001–3015 (2020)
Ølnes, S., Ubacht, J., Janssen, M.: Blockchain in government: benefits and implications of distributed ledger technology for information sharing. Gov. Inf. Q. 34, 355–364 (2017)
Vacca, A., Di Sorbo, A., Visaggio, C.A., Canfora, G.: A systematic literature review of blockchain and smart contract development: techniques, tools, and open challenges. J. Syst. Softw. 174, 110891 (2021). https://doi.org/10.1016/j.jss.2020.110891
Liu, M., Wu, K., Xu, J.J.: How will blockchain technology impact auditing and accounting: permissionless versus permissioned blockchain. Current Issues Audit. 13, A19–A29 (2019)
Mingxiao, D., Xiaofeng, M., Zhe, Z., Xiangwei, W., Qijun, C.: A review on consensus algorithm of blockchain. In: 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2567–2572. IEEE (2017)
Woo, T.Y., Lam, S.S.: Authentication for distributed systems. Computer 25, 39–52 (1992)
Swain, P.H., Hauska, H.: The decision tree classifier: design and potential. IEEE Trans. Geosci. Electron. 15, 142–147 (1977)
Song, Y.-Y., Ying, L.: Decision tree methods: applications for classification and prediction. Shanghai Arch. Psychiatry 27, 130 (2015)
Wright, R.E.: Logistic regression (1995)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Langley, P., Iba, W., Thompson, K., et al.: An analysis of Bayesian classifiers. In: AAAI, vol. 90, pp. 223–228. Citeseer (1992)
Boser, B.E., Guyon, I.M., Vapnik, V.N.: A training algorithm for optimal margin classifiers. In: Proceedings of the 5th Annual Workshop on Computational Learning Theory, pp. 144–152 (1992)
Wu, Y., Jiang, X., Kim, J., Ohno-Machado, L.: Grid Binary LOgistic REgression (GLORE): building shared models without sharing data. J. Am. Med. Inf. Assoc. 19, 758–764 (2012)
Jiang, W., Li, P., Wang, S., Wu, Y., Xue, M., Ohno-Machado, L., Jiang, X.: Webglore: a web service for grid logistic regression. Bioinformatics 29, 3238–3240 (2013)
Shi, H., Jiang, C., Dai, W., Jiang, X., Tang, Y., Ohno-Machado, L., Wang, S.: Secure multi-pArty computation grid LOgistic REgression (SMAC-GLORE). BMC Med. Inform. Decis. Mak. 16, 175–187 (2016)
Wang, S., Jiang, X., Wu, Y., Cui, L., Cheng, S., Ohno-Machado, L.: Expectation propagation logistic regression (explorer): distributed privacy-preserving online model learning. J. Biomed. Inf. 46, 480–496 (2013)
Li, Y., Jiang, X., Wang, S., Xiong, H., Ohno-Machado, L.: Vertical grid logistic regression (vertigo). J. Am. Med. Inform. Assoc. 23, 570–579 (2016)
Huang, L., Shea, A.L., Qian, H., Masurkar, A., Deng, H., Liu, D.: Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records. J. Biomed. Inform. 99, 103291 (2019)
Wang, S., Chang, T.-H.: Federated clustering via matrix factorization models: from model averaging to gradient sharing. arXiv:2002.04930, (2020)
Mohassel, P., Zhang, Y.: Secureml: A system for scalable privacy-preserving machine learning. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 19–38. IEEE (2017)
Shokri, R., Shmatikov, V.: Privacy-preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1310–1321 (2015)
Phuong, T.T., et al.: Privacy-preserving deep learning via weight transmission. IEEE Trans. Inf. Forensics Secur. 14, 3003–3015 (2019)
Aono, Y., Hayashi, T., Wang, L., Moriai, S., et al.: Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 13, 1333–1345 (2017)
Brisimi, T.S., Chen, R., Mela, T., Olshevsky, A., Paschalidis, I.C., Shi, W.: Federated learning of predictive models from federated electronic health records. Int. J Med. Inform. 112, 59–67 (2018)
Duan, M., Liu, D., Chen, X., Tan, Y., Ren, J., Qiao, L., Liang, L.: Astraea: Self-balancing federated learning for improving classification accuracy of mobile deep learning applications. In: 2019 IEEE 37th International Conference on Computer Design (ICCD), pp. 246–254. IEEE (2019)
Xie, M., Long, G., Shen, T., Zhou, T., Wang, X., Jiang, J.: Multi-center federated learning. arXiv:2005.01026, (2020)
Kim, Y., Hakim, E. A., Haraldson, J., Eriksson, H., Silva Jr., J. M.B.D., Fischione, C.: Dynamic clustering in federated learning. arXiv:2012.03788 (2020)
Choudhury, O., Gkoulalas-Divanis, A., Salonidis, T., Sylla, I., Park, Y., Hsu, G., Das, A.: Differential privacy-enabled federated learning for sensitive health data. arXiv:1910.02578 (2019)
Bouacida, N., Mohapatra, P.: Vulnerabilities in federated learning. IEEE Access 23(9), 63229–49 (2021)
Kuo, T.-T., Kim, J., Gabriel, R.A.: Privacy-preserving model learning on a blockchain network-of-networks. J. Am. Med. Inform. Assoc. 27, 343–354 (2020)
Kuo, T.-T., Gabriel, R.A., Ohno-Machado, L.: Fair compute loads enabled by blockchain: sharing models by alternating client and server roles. J. Am. Med. Inform. Assoc. 26, 392–403 (2019)
Kuo, T.-T., Gabriel, R.A., Cidambi, K.R., Ohno-Machado, L.: Expectation propagation logistic regression on permissioned blockchain (ExplorerChain): decentralized online healthcare/genomics predictive model learning. J. Am. Med. Inform. Assoc. 27, 747–756 (2020)
Kennedy, R.L., Fraser, H.S., McStay, L.N., Harrison, R.F.: Early diagnosis of acute myocardial infarction using clinical and electrocardiographic data at presentation: derivation and evaluation of logistic regression models. Eur. Heart J. 17(8), 1181–91 (1996)
Dua, D., Graff, C.: UCI machine learning repository. URL:http://archive.ics.uci.edu/ml (2017)
Jere, M.S., Farnan, T., Koushanfar, F.: A taxonomy of attacks on federated learning. IEEE Secur. Privacy 19(2), 20–8 (2020)
Issa, W., Moustafa, N., Turnbull, B., Sohrabi, N., Tari, Z.: Blockchain-based federated learning for securing internet of things: a comprehensive survey. ACM Comput. Surv. 55(9), 1–43 (2023)
Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1, 81–106 (1986)
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Routledge, Milton Park (2017)
Daemen, J., Rijmen, V.: AES proposal: Rijndael (1999)
Standard, D.E., et al.: Data encryption standard. Federal Information Processing Standards Publication, 112 (1999)
Kim, H., Park, J., Bennis, M., Kim, S.L.: Blockchained on-device federated learning. IEEE Commun. Lett. 24(6), 1279–1283 (2019)
Short, A.R., Leligou, H.C., Papoutsidakis, M., Theocharis, E.: Using blockchain technologies to improve security in federated learning systems. In: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 1183–1188. IEEE (2020)
Yin, X., Zhu, Y., Hu, J.: A comprehensive survey of privacy-preserving federated learning: a taxonomy, review, and future directions. ACM Comput. Surv. (CSUR) 54(6), 1–36 (2021)
Wei, K., Li, J., Ding, M., Ma, C., Yang, H.H., Farokhi, F., Jin, S., Quek, T.Q., Poor, H.V.: Federated learning with differential privacy: algorithms and performance analysis. IEEE Trans. Inf. Forensics Secur. 17(15), 3454–69 (2020)
Li, N., Li, T., Venkatasubramanian, S.: t-closeness: privacy beyond k-anonymity and l-diversity. In: 2007 IEEE 23rd International Conference on Data Engineering, pp. 106–115. IEEE (2006, April)

Funding

The study has not been funded by any institute or agency.

Author information

Authors and Affiliations

Department of Information Technology, Indira Gandhi Delhi Technical University for Women, New Delhi, India
Vidhi Bansal & Mohona Ghosh
Department of Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra, India
Niyati Baliyan

Corresponding author

Correspondence to Mohona Ghosh.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Tables and figures of results on 3-node setup

See Figs. 12, 13, 14, 15 and Tables 11, 12, 13, 14.

Table 12 Performance of logistic regression on 3-node setup

Table 13 Performance of Gaussian naive Bayes on 3-node setup

Table 14 Performance of support vector machine on 3-node setup

1.2 Figures of results on 5-node setup with a subset of EEG dataset

See Figs. 16, 17, 18, 19.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bansal, V., Baliyan, N. & Ghosh, M. MLChain: a privacy-preserving model learning framework using blockchain. Int. J. Inf. Secur. 23, 649–677 (2024). https://doi.org/10.1007/s10207-023-00754-3

Published: 28 September 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s10207-023-00754-3
