A privacy-preserving algorithm for distributed training of neural network ensembles

  • Original Article
Neural Computing and Applications

Abstract

Distributed training allows individual entities to benefit from the training sets owned by other entities. However, distributed training also raises serious privacy concerns, so protecting privacy during distributed training is highly important. In this paper, we study privacy protection in the distributed training of neural network ensembles. We design a privacy-preserving distributed algorithm for training neural network ensembles using AdaBoost.M2. We also analyze the security and complexity of our algorithm. Furthermore, we perform experiments on two data sets from the UCI repository to verify the algorithm’s effectiveness and efficiency.
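As background for the boosting procedure named above, the following is a minimal sketch of one round of standard, non-private, single-machine AdaBoost.M2 as described by Freund and Schapire [18]. It is not the distributed, privacy-preserving protocol of this paper. The function name `adaboost_m2_round` and the list-of-lists data layout are assumptions made for illustration only.

```python
def adaboost_m2_round(h, labels, w):
    """One round of standard AdaBoost.M2 (illustrative, non-private).

    h[i][y]   : plausibility in [0, 1] that sample i has label y
    labels[i] : correct label of sample i
    w[i][y]   : mislabel weight of the pair (i, y); entries with
                y == labels[i] are ignored
    Returns (pseudo_loss, beta, new_w).
    Assumes the pseudo-loss is below 1, so beta is well defined.
    """
    n, k = len(h), len(h[0])
    # W[i]: total mislabel weight carried by sample i
    W = [sum(w[i][y] for y in range(k) if y != labels[i]) for i in range(n)]
    total = sum(W)

    eps = 0.0
    for i in range(n):
        d_i = W[i] / total  # distribution over samples
        # plausibility assigned to wrong labels, weighted by q(i, y) = w / W
        wrong = sum((w[i][y] / W[i]) * h[i][y]
                    for y in range(k) if y != labels[i])
        eps += 0.5 * d_i * (1.0 - h[i][labels[i]] + wrong)

    beta = eps / (1.0 - eps)

    # Mislabel pairs the hypothesis handles well get their weight shrunk most
    new_w = [
        [0.0 if y == labels[i]
         else w[i][y] * beta ** (0.5 * (1.0 + h[i][labels[i]] - h[i][y]))
         for y in range(k)]
        for i in range(n)
    ]
    return eps, beta, new_w


# Two samples, three classes, uniform initial mislabel weights D(i)/(k-1)
h = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
labels = [0, 1]
w0 = [[0.0, 0.25, 0.25], [0.25, 0.0, 0.25]]
eps, beta, w1 = adaboost_m2_round(h, labels, w0)
```

In the full algorithm this round is repeated for each component network, and the final ensemble predicts the label y that maximizes the sum over rounds of log(1/beta) times the plausibility h(x, y).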

Notes

  1. For ease of description, we write $h_t(x_i, y)$ as $h_t(i, y)$ throughout the paper.

  2. In an asymmetric cryptographic scheme, keys are generated in pairs (K+, K−). The published public key K+ is used for encryption, and the private key K− is used for decryption.

  3. Note that this scaling operation can be easily implemented in the privacy-preserving training algorithm proposed in [6], since the authors use linear functions to approximate the sigmoid function.

  4. For the sake of clarity and accuracy, we omit the bagging line and add test cases for ensembles with 4, 6, and 8 component networks.

  5. Here we assume that the exhaustive search is performed using the same search order (i.e., increasing order and decreasing order) every time.
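The key-pair convention of note 2 can be illustrated with a toy example. The sketch below uses textbook ElGamal [23] only because it appears in the reference list; this excerpt does not say which scheme the paper's algorithm uses. The parameters `P` and `G` and the function names are hypothetical, and the modulus is far too small to be secure.

```python
import secrets

P = 2579  # toy prime modulus (assumption; insecure, illustration only)
G = 2     # toy base (assumption)


def keygen():
    """Generate a key pair (K+, K-)."""
    k_minus = 1 + secrets.randbelow(P - 2)  # private key in [1, P-2]
    k_plus = pow(G, k_minus, P)             # public key, safe to publish
    return k_plus, k_minus


def encrypt(k_plus, m):
    """Encrypt a message m in [1, P-1] under the public key K+."""
    if not 1 <= m < P:
        raise ValueError("message must lie in [1, P-1]")
    r = 1 + secrets.randbelow(P - 2)        # fresh randomness per message
    return pow(G, r, P), (m * pow(k_plus, r, P)) % P


def decrypt(k_minus, ciphertext):
    """Decrypt with the private key K-."""
    c1, c2 = ciphertext
    # c1^(P-1-K-) is the inverse of c1^(K-) modulo P by Fermat's little theorem
    mask_inv = pow(c1, P - 1 - k_minus, P)
    return (c2 * mask_inv) % P


k_plus, k_minus = keygen()
recovered = decrypt(k_minus, encrypt(k_plus, 1234))
```

Anyone holding K+ can encrypt, but only the holder of K− can recover the message, which is the property note 2 describes.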

References

  1. Flouri K, Beferull-Lozano B, Tsakalides P (2006) Training a SVM-based classifier in distributed sensor networks. In: European signal processing conference, Florence, Italy, Sep

  2. Stolfo SJ, Prodromidis AL, Tselepis S, Lee W, Fan DW, Chan PK (1997) JAM: Java agent for meta-learning over distributed databases. In: Proceedings of ACM SIGKDD international conference on knowledge discovery data mining, pp 74–81

  3. Navia-Vázquez A, Gutiérrez-González D, Parrado-Hernández E, Navarro-Abellán JJ (2006) Distributed support vector machines. IEEE Trans Neural Netw 17(4):1091–1097

  4. Samet S, Miri A (2008) Privacy-preserving protocols for perceptron learning algorithm in neural networks. In: 2008 4th international IEEE conference on intelligent system

  5. Secretan J, Georgiopoulos M, Castro J (2007) A privacy preserving probabilistic neural network for horizontally partitioned databases. In: Proceedings of international joint conference on neural networks, Orlando, FL, USA

  6. Chen T, Zhong S (2009) Privacy-preserving backpropagation neural network learning. IEEE Trans Neural Netw 20(10):1554–1564

  7. Goh WY, Lim CP, Peh KK (2003) Predicting drug dissolution profiles with an ensemble of boosted neural networks: a time series approach. IEEE Trans Neural Netw 14(2):459–463

  8. Sharkey AJ (1999) Combining artificial neural nets: ensemble and modular multi-net systems. Springer-Verlag New York, Inc., Secaucus, NJ

  9. Sollich P, Krogh A (1996) Learning with ensembles: how over-fitting can be useful. In: Touretzky DS, Mozer MC, Hasselmo ME (eds) Advances in neural information processing systems 8, Denver, CO, MIT Press, Cambridge, MA, pp 190–196

  10. Hansen LK, Liisberg L, Salamon P (1992) Ensemble methods for handwritten digit recognition. In: Proceedings of IEEE workshop on neural network for signal processing, Helsingoer, Denmark, IEEE Press, Piscataway, NJ, pp 333–342

  11. West D, Dellana S, Qian J (2005) Neural network ensemble strategies for financial decision applications. Comput Oper Res 32:2543–2559

  12. Tsai CF, Wu JW (2008) Using neural network ensembles for bankruptcy prediction and credit scoring. Expert Syst Appl 34:2639–2649

  13. Zhou ZH, Jiang Y, Yang YB, Chen SF (2002) Lung cancer cell identification based on artificial neural network ensembles. Artif Intell Med 24(1):25–36

  14. Das R, Turkoglu I, Sengur A (2009) Diagnosis of valvular heart disease through neural networks ensembles. Comput Methods Programs Biomed 93(2):185–191

  15. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140

  16. Schapire RE (1990) The strength of weak learnability. Mach Learn 5(2):197–227

  17. Freund Y, Schapire RE (1997) A decision theoretic generalization of online learning and an application to boosting. J Comput Syst Sci 55(1):119–139

  18. Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: Machine learning: proceedings of the thirteenth international conference, Morgan Kaufmann, San Francisco, pp 148–156

  19. Goldreich O (2001) Foundations of cryptography, vols 1 and 2. Cambridge University Press, Cambridge

  20. Goethals B, Laur S, Lipmaa H, Mielikäinen T (2004) On private scalar product computation for privacy-preserving data mining. In: Park C, Chee S (eds) Information security and cryptology – ICISC 2004, vol 3506 of lecture notes in computer science. Springer, Berlin, pp 104–120

  21. Zhong S (2007) Privacy-preserving algorithms for distributed mining of frequent itemsets. Inf Sci 177(2):490–503

  22. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: EUROCRYPT, pp 223–238

  23. ElGamal T (1985) A public-key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans Inf Theory IT-31(4):469–472

  24. Boneh D (1998) The decision Diffie-Hellman problem. In: Proceedings of 3rd algorithmic number theory symposium, pp 48–63

  25. Barni M, Orlandi C, Piva A (2006) A privacy-preserving protocol for neural-network-based computation. In: Proceedings of 8th workshop multimedia security, New York, pp 146–151

  26. Orlandi C, Piva A, Barni M (2007) Oblivious neural network computing via homomorphic encryption. EURASIP J Inf Secur 2007:18:1–18:10

  27. Wan L, Ng WK, Lee VCS (2007) Privacy-preservation for gradient descent methods. In: Proceedings of ACM SIGKDD international conference on knowledge discovery data mining, pp 775–783

  28. Abramowitz M, Stegun IA (1970) Handbook of mathematical functions with formulas, graphs, and mathematical tables. Dover Publications, New York (Ninth printing)

  29. Blake CL, Merz CJ (1998) UCI repository of machine learning database. University of California, Department of Information and Computer Science, Irvine, CA [Online]. Available: http://www.ics.uci.edu/mlearn/MLRepository.htm

  30. Dai W (2010) The Crypto++ library 5.6.0. http://www.cryptopp.co

  31. Lazarevic A, Obradovic Z (2001) The distributed boosting algorithm. In: Proceedings of 7th international conference on knowledge discovery data mining, pp 311–316

  32. Lazarevic A, Obradovic Z (2002) Boosting algorithms for parallel and distributed learning. Distrib Parallel Databases 11(2):203–229

  33. Fan W, Stolfo S, Zhang J (1999) The application of AdaBoost for distributed, scalable and on-line learning. In: Proceedings of 5th international conference on knowledge discovery data mining, pp 362–366

  34. Gambs S, Kégl B, Aïmeur E (2007) Privacy-preserving boosting. Data Min Knowl Disc 14:131–170

  35. Agrawal D, Srikant R (2000) Privacy-preserving data mining. In: Proceedings of ACM SIGMOD, pp 439–450

  36. Lindell Y, Pinkas B (2000) Privacy preserving data mining. In: Lecture notes in computer science, vol 1880. Springer, Berlin, pp 36–44

  37. Du W, Zhan Z (2003) Using randomized response techniques for privacy-preserving data mining. In: KDD

  38. Chen K, Liu L (2005) Privacy preserving data classification with rotation perturbation. In: Proceedings of international conference on data mining, pp 589–592

  39. Yang Z, Zhong S, Wright R (2005) Privacy-preserving classification of customer data without loss of accuracy. In: 2005 SIAM international conference on data mining (SDM2005)

  40. Wright R, Yang Z (2004) Privacy-preserving Bayesian network structure computation on distributed heterogeneous data. In: Proceedings of 10th ACM SIGKDD international conference on knowledge discovery data mining, pp 713–718

Author information

Corresponding author

Correspondence to Sheng Zhong.

About this article

Cite this article

Zhang, Y., Zhong, S. A privacy-preserving algorithm for distributed training of neural network ensembles. Neural Comput & Applic 22 (Suppl 1), 269–282 (2013). https://doi.org/10.1007/s00521-012-1000-8

  • DOI: https://doi.org/10.1007/s00521-012-1000-8
