
Efficient Pruning for Machine Learning Under Homomorphic Encryption

Conference paper, published in Computer Security – ESORICS 2023 (ESORICS 2023)

Abstract

Privacy-preserving machine learning (PPML) solutions are gaining widespread popularity. Among these, many rely on homomorphic encryption (HE), which offers confidentiality of the model and the data, but at the cost of high latency and large memory requirements. Pruning neural network (NN) parameters improves latency and memory in plaintext ML but has little impact if directly applied to HE-based PPML.

We introduce a framework called HE-PEx that comprises new pruning methods, on top of a packing technique called tile tensors, for reducing the latency and memory of PPML inference. HE-PEx uses permutations to prune additional ciphertexts, and expansion to recover the resulting loss in inference quality. We demonstrate the effectiveness of our methods for pruning fully-connected and convolutional layers in NNs on PPML tasks, namely, image compression, denoising, and classification, with autoencoders, multilayer perceptrons (MLPs) and convolutional neural networks (CNNs).

We implement and deploy our networks atop a framework called HElayers; our pruned networks show a 10–35% improvement in inference speed and a 17–35% decrease in memory requirement over the unpruned networks, corresponding to 33–65% fewer ciphertexts, within a 2.5% degradation in inference accuracy. Compared to the state-of-the-art pruning technique for PPML, our techniques generate networks with, on average, 70% fewer ciphertexts for the same degradation limit.


Notes

  1. Gl/L1/Neu and Gl/L1/Chan were not evaluated, since PyTorch limits the scope of global pruning to unstructured methods only.

References

  1. Aharoni, E., Drucker, N., Ezov, G., Shaul, H., Soceanu, O.: Complex encoded tile tensors: accelerating encrypted analytics. IEEE Secur. Priv. 20, 35–43 (2021)

  2. Aharoni, E., et al.: HElayers: a tile tensors framework for large neural networks on encrypted data. PoPETs (2023)

  3. Akavia, A., Vald, M.: On the privacy of protocols based on CPA-secure homomorphic encryption. Cryptology ePrint Archive (2021)

  4. Albrecht, M., et al.: Homomorphic encryption security standard. Technical report, HomomorphicEncryption.org, November 2018

  5. Baruch, M., et al.: Sensitive tuning of large scale CNNs for E2E secure prediction using homomorphic encryption. arXiv preprint arXiv:2304.14836 (2023)

  6. Baruch, M., Drucker, N., Greenberg, L., Moshkowich, G.: A methodology for training homomorphic encryption friendly neural networks. In: Zhou, J., et al. (eds.) ACNS 2022. LNCS, vol. 13285, pp. 536–553. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16815-4_29

  7. Blalock, D., Gonzalez Ortiz, J.J., Frankle, J., Guttag, J.: What is the state of neural network pruning? Proc. Mach. Learn. Syst. 2, 129–146 (2020)

  8. Boemer, F., Costache, A., Cammarota, R., Wierzynski, C.: nGraph-HE2: a high-throughput framework for neural network inference on encrypted data. In: Proceedings of the 7th ACM Workshop on Encrypted Computing & Applied Homomorphic Cryptography (WAHC 2019). Association for Computing Machinery (2019)

  9. Cai, Y., Zhang, Q., Ning, R., Xin, C., Wu, H.: Hunter: HE-friendly structured pruning for efficient privacy-preserving deep learning. In: Proceedings of the 2022 ACM Asia Conference on Computer and Communications Security (ASIA CCS 2022) (2022)

  10. Cheon, J.H., Kim, A., Kim, M., Song, Y.: Homomorphic encryption for arithmetic of approximate numbers. In: Takagi, T., Peyrin, T. (eds.) ASIACRYPT 2017. LNCS, vol. 10624, pp. 409–437. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70694-8_15

  11. Chou, E., Beal, J., Levy, D., Yeung, S., Haque, A., Fei-Fei, L.: Faster CryptoNets: leveraging sparsity for real-world encrypted inference. arXiv preprint (2018)

  12. EU General Data Protection Regulation: Regulation (EU) 2016/679, repealing Directive 95/46/EC (General Data Protection Regulation). Official Journal of the European Union 119 (2016). http://data.europa.eu/eli/reg/2016/679/oj

  13. Gong, Y., et al.: A privacy-preserving-oriented DNN pruning and mobile acceleration framework. In: Proceedings of the 2020 Great Lakes Symposium on VLSI (2020)

  14. Gottemukkula, V.: Polynomial activation functions (2019). https://openreview.net/pdf?id=rkxsgkHKvH

  15. Gunraj, H., Wang, L., Wong, A.: COVIDNet-CT: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest CT images. Front. Med. 7, 608525 (2020)

  16. Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. In: 4th International Conference on Learning Representations (ICLR 2016) (2016)

  17. Juvekar, C., Vaikuntanathan, V., Chandrakasan, A.: GAZELLE: a low latency framework for secure neural network inference. In: 27th USENIX Security Symposium, pp. 1651–1669. USENIX Association, Baltimore, August 2018

  18. Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25 (2012)

  19. Lehmkuhl, R., Mishra, P., Srinivasan, A., Popa, R.A.: Muse: secure inference resilient to malicious clients. In: USENIX Security 2021. USENIX Association (2021)

  20. Lou, Q., Jiang, L.: HEMET: a homomorphic-encryption-friendly privacy-preserving mobile neural network architecture. In: Proceedings of the 38th International Conference on Machine Learning, vol. 139, pp. 7102–7110. PMLR (2021)

  21. Luo, J.H., Wu, J., Lin, W.: ThiNet: a filter level pruning method for deep neural network compression. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5058–5066 (2017)

  22. Microsoft: SEAL (release 3.5). https://github.com/Microsoft/SEAL

  23. Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning (2011)

  24. Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48910-X_16

  25. Riazi, M.S., Samragh, M., Chen, H., Laine, K., Lauter, K., Koushanfar, F.: XONN: XNOR-based oblivious deep neural network inference. In: 28th USENIX Security Symposium, pp. 1501–1518. USENIX Association, Santa Clara, August 2019

  26. The HEBench Organization: HEBench (2022). https://hebench.github.io/

  27. Wang, C., Zhang, G., Grosse, R.: Picking winning tickets before training by preserving gradient flow. arXiv preprint arXiv:2002.07376 (2020)

  28. Wang, J., Jin, C., Meftah, S., Aung, K.M.M.: Popcorn: Paillier meets compression for efficient oblivious neural network inference (2021)

  29. Yang, Y., Kuppannagari, S.R., Kannan, R., Prasanna, V.K.: FPGA accelerator for homomorphic encrypted sparse convolutional neural network inference. In: 2022 IEEE 30th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 1–9 (2022)

  30. Yuan, X., Zhang, L.: Membership inference attacks and defenses in neural network pruning. In: USENIX Security 2022. USENIX Association (2022)

Author information

Correspondence to Nir Drucker.

A Appendix

A.1 Additional Evaluation

Selection of P4E prune\(^\textrm{pack}\) Threshold. We justify our choice of threshold for the prune\(^\textrm{pack}\) step of P4E (the percentage of zeros in a tile above which the tile is pruned) by running a sweep and reporting the results in Fig. 12, for our AE-Compressor with CIFAR-10; the behavior is similar for other networks and datasets. We observe that large threshold values do not introduce enough new zero tiles to improve meaningfully upon P3E. At the other extreme, values that are too small remove weights aggressively, degrading the test loss. We select 93.8% here, but posit that our results can be further improved by tuning this value as a hyperparameter.

Fig. 12. Test loss vs. tile sparsity (% of zero tiles) for different prune\(^\textrm{pack}\) thresholds of our P4E scheme, considering four tile shapes, using the AE-Compressor with CIFAR-10.
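
As an illustration of the thresholding rule, the following is a minimal NumPy sketch of tile-level pruning (our own code, not the HElayers implementation; the names prune_pack, tile_shape, and threshold are ours):

import numpy as np

def prune_pack(weight, tile_shape=(8, 8), threshold=0.938):
    """Zero out every tile whose fraction of zeros meets or exceeds
    `threshold`, so the ciphertext that would hold it can be skipped."""
    th, tw = tile_shape
    pruned = weight.copy()
    rows, cols = pruned.shape
    for r in range(0, rows, th):
        for c in range(0, cols, tw):
            tile = pruned[r:r + th, c:c + tw]  # view into `pruned`
            if (tile == 0).mean() >= threshold:
                tile[...] = 0.0  # prune the whole tile
    return pruned

# Example: element-wise pruning typically leaves zeros scattered across
# the matrix; prune_pack then clears the almost-all-zero tiles.
w = np.where(np.random.rand(64, 64) < 0.95, 0.0, np.random.randn(64, 64))
w_tiled = prune_pack(w)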

Comparison of Permutation Algorithms. We compare different permutation algorithms for the permute step in HE-PEx.

Fig. 13. Increase in tile sparsity (log scale) with different permutation algorithms for AlexNet on the COVIDx CT-2A dataset, after Lc/L1/Wei pruning.

Figure 13 shows the results for AlexNet with Lc/L1/Wei pruning of 97.5% of the weights, the same scenario discussed in Sect. 6.3. As expected, “Random Reordering” for the row-column co-permutation provides little-to-no benefit, given the sheer number of possible permutations of the entire network. Interestingly, k-means alone improves tile sparsity by less than 10%. Moreover, the “Grouped Hamming” variant (Sect. 4.2) shows no significant benefit for the smaller tile shapes; however, as the \(64\times 64\) case shows, the grouped Hamming distance is critical for grouping zeros together at large tile shapes. Finally, balancing the clusters in k-means has the largest impact, as it brings the first dimension of the tile shape, a critical piece of information, into the optimization algorithm. In summary, our permutation algorithm variant improves the tile sparsity by \(\sim \)4–1,000\(\%\) for this network.
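
The permute step itself uses a balanced k-means variant with a grouped Hamming distance; as a much simpler stand-in that conveys the idea, the sketch below (our own greedy nearest-neighbor chain, not the algorithm of Sect. 4.2) orders rows so that similar zero patterns become adjacent:

import numpy as np

def permute_rows_greedy(weight):
    """Order rows so that rows with similar zero patterns (small Hamming
    distance between nonzero masks) become adjacent, helping all-zero
    tiles form once the matrix is re-tiled."""
    masks = (weight != 0)
    remaining = list(range(weight.shape[0]))
    order = [remaining.pop(0)]  # start arbitrarily from row 0
    while remaining:
        last = masks[order[-1]]
        # nearest remaining row by Hamming distance on the nonzero mask
        nxt = min(remaining, key=lambda r: int(np.sum(masks[r] != last)))
        remaining.remove(nxt)
        order.append(nxt)
    return np.asarray(order)

# The matching permutation must be fused into the adjacent layers (e.g.,
# the next layer's weight columns) so the network's function is unchanged.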

Table 2. Network configurations and hyperparameters. [HL = hidden layer]
Fig. 14. Flowchart illustrating the P4E strategy, with local pruning scope for the prune step and global scope for prune\(^\textrm{pack}\). This is executed as a pre-deployment step on the entity that has access to the plaintext network parameters.

A.2 Experimental Setup

Our experiments involving the methods in Sect. 4.1 were run on a cluster of systems equipped with Tesla A100 GPUs and Xeon Gold 6258R CPUs, using PyTorch 1.11.0 accelerated with CUDA 11.6. For the end-to-end evaluation using HElayers [2], we used a 44-core Xeon E5-2699 CPU @ 2.20 GHz with 750 GB of memory. We observed that our pruned networks did not fully utilize all of the cores, with performance saturating at around 8 cores; for a fair comparison, we therefore limited the system to 8 cores in all of our experiments. We use a modified HElayers [2] with the CKKS implementation of SEAL [22], targeting 128-bit security, and the reported results are the average of 10 runs.

Network Configurations and Model Hyperparameters. Table 2 details the activation functions, network architecture, and training hyperparameters for each of the networks and datasets in Sect. 5.

Enhancing HElayers. We integrate HE-PEx with HElayers [2]; Fig. 14 shows a flowchart for P4E, which can be extended as needed to our other strategies. The flowchart illustrates a grid search, with {thres_all} and {tile_shapes} being pre-selected sets of values. The output of the framework is a deployment-ready model that meets an objective function based on accuracy, latency, throughput, and memory. A local search strategy may be considered to navigate the state space with fewer training and permutation rounds, which are the most time-consuming segments of this flow. The zero-tile skipping logic is programmed into HElayers by associating a zero_flag bit with each ciphertext, which is set during the HE Enc procedure (Sect. 3.1).
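
To illustrate how such a zero_flag can be consumed at inference time, here is a minimal Python sketch (TaggedCiphertext, tile_mul_add, and the operator arguments are hypothetical stand-ins, not the HElayers API):

from dataclasses import dataclass
from typing import Any

@dataclass
class TaggedCiphertext:
    """Ciphertext wrapper carrying a zero_flag that is set at encryption
    time whenever the packed tile is entirely zero."""
    data: Any
    zero_flag: bool = False

def tile_mul_add(acc, ct, wt, he_mul, he_add):
    """Multiply-accumulate one pair of tiles, skipping the expensive HE
    operations when either operand is a known-zero tile."""
    if ct.zero_flag or wt.zero_flag:
        return acc  # the product is zero; nothing to accumulate
    return he_add(acc, he_mul(ct.data, wt.data))

# Toy demo with plaintext lambdas standing in for the HE primitives:
mul, add = (lambda a, b: a * b), (lambda a, b: a + b)
acc = 0.0
acc = tile_mul_add(acc, TaggedCiphertext(3.0), TaggedCiphertext(2.0), mul, add)
acc = tile_mul_add(acc, TaggedCiphertext(0.0, True), TaggedCiphertext(5.0), mul, add)
print(acc)  # 6.0: the second call was skipped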

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Aharoni, E., et al. (2024). Efficient Pruning for Machine Learning Under Homomorphic Encryption. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds.) Computer Security – ESORICS 2023. Lecture Notes in Computer Science, vol. 14347. Springer, Cham. https://doi.org/10.1007/978-3-031-51482-1_11

  • DOI: https://doi.org/10.1007/978-3-031-51482-1_11
  • Print ISBN: 978-3-031-51481-4
  • Online ISBN: 978-3-031-51482-1