Abstract
Somewhat Homomorphic Encryption (SHE) allows arbitrary computation of bounded multiplicative depth to be performed on encrypted data, but its overhead is high due to the memory transfer incurred by large ciphertexts. Recent research has recognized the shortcomings of general-purpose computing for high-performance SHE, and has begun to pioneer hardware-based SHE acceleration on platforms including FPGAs, GPUs, and Compute-Enabled RAM (CE-RAM). CE-RAM is well-suited for SHE, as it is not limited by the separation between memory and processing that bottlenecks other hardware. Further, CE-RAM does not move data between different processing elements. Recent research has shown the high effectiveness of CE-RAM for SHE as compared to highly-optimized CPU and FPGA implementations. However, algorithmic optimization for CE-RAM implementations is underexplored. In this work, we examine the effect of existing algorithmic optimizations upon a CE-RAM implementation of the B/FV scheme [19], and further introduce novel optimization techniques for the Full RNS Variant of B/FV [6]. Our experiments show speedups of up to 784x for homomorphic multiplication, 143x for decryption, and 330x for encryption against a CPU implementation. We also compare our approach to similar work in CE-RAM, FPGA, and GPU acceleration of B/FV, and note general improvement over existing work. In particular, for homomorphic multiplication we see speedups of 506.5x against CE-RAM [34], 66.85x against FPGA [36], and 30.8x against GPU [3].
References
Agarwal, R., Burrus, C.: Fast convolution using Fermat number transforms with applications to digital filtering. IEEE Trans. Acoust. Speech Signal Process. 22(2), 87–97 (1974)
Al Badawi, A., Veeravalli, B., Mun, C.F., Aung, K.M.M.: High-performance FV somewhat homomorphic encryption on GPUs: an implementation using CUDA. In: IACR CHES, pp. 70–95 (2018)
Al Badawi, A.Q.A., Polyakov, Y., Aung, K.M.M., Veeravalli, B., Rohloff, K.: Implementation and performance evaluation of RNS variants of the BFV homomorphic encryption scheme. IEEE TETC (2019)
Albrecht, M., Bai, S., Ducas, L.: A subfield lattice attack on overstretched NTRU assumptions. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9814, pp. 153–178. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53018-4_6
Albrecht, M., et al.: Homomorphic encryption security standard. Technical report, HomomorphicEncryption.org, Toronto, Canada (2018)
Bajard, J.-C., Eynard, J., Hasan, M.A., Zucca, V.: A full RNS variant of FV like somewhat homomorphic encryption schemes. In: Avanzi, R., Heys, H. (eds.) SAC 2016. LNCS, vol. 10532, pp. 423–442. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69453-5_23
Bajard, J.-C., Eynard, J., Martins, P., Sousa, L., Zucca, V.: Note on the noise growth of the RNS variants of the BFV scheme
Bajard, J.-C., Eynard, J., Merkiche, N.: Montgomery reduction within the context of residue number system arithmetic. J. Cryptogr. Eng. 8(3), 189–200 (2017). https://doi.org/10.1007/s13389-017-0154-9
Barrett, P.: Implementing the Rivest Shamir and Adleman public key encryption algorithm on a standard digital signal processor. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 311–323. Springer, Heidelberg (1987). https://doi.org/10.1007/3-540-47721-7_24
Chen, H., Han, K., Huang, Z., Jalali, A., Laine, K.: Simple encrypted arithmetic library v2.3.0. Microsoft (2017)
Cheon, J.H., Kim, A., Kim, M., Song, Y.: Homomorphic encryption for arithmetic of approximate numbers. In: Takagi, T., Peyrin, T. (eds.) ASIACRYPT 2017. LNCS, vol. 10624, pp. 409–437. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70694-8_15
Cilardo, A., Argenziano, D.: Securing the cloud with reconfigurable computing: an FPGA accelerator for homomorphic encryption. In: 2016 Design, Automation Test in Europe Conference Exhibition (DATE), pp. 1622–1627 (2016)
Cousins, D., Rohloff, K., Polyakov, Y., Ryan, G.J.: The PALISADE lattice cryptography library (2015–2020). https://palisade-crypto.org/
Cousins, D.B., Rohloff, K., Sumorok, D.: Designing an FPGA-accelerated homomorphic encryption co-processor. IEEE ToETiC 5(2), 193–206 (2017)
Crandall, R., Pomerance, C.B.: Prime Numbers: A Computational Perspective, vol. 182. Springer, New York (2006). https://doi.org/10.1007/978-1-4684-9316-0
Doröz, Y., Öztürk, E., Sunar, B.: A million-bit multiplier architecture for fully homomorphic encryption. Microprocess. Microsyst. 38(8), 766–775 (2014)
Duarte, J.P., et al.: BSIM-CMG: standard FinFET compact model for advanced circuit design. In: ESSCIRC, pp. 196–201, September 2015
Cadence Design Systems, Inc.: Cadence design environment (2005). www.cadence.com
Fan, J., Vercauteren, F.: Somewhat practical fully homomorphic encryption. IACR Cryptology ePrint Archive 2012:144 (2012)
Gentry, C.: Fully homomorphic encryption using ideal lattices. In: ACM STOC, pp. 169–178 (2009)
Halevi, S., Polyakov, Y., Shoup, V.: An improved RNS variant of the BFV homomorphic encryption scheme. In: Matsui, M. (ed.) CT-RSA 2019. LNCS, vol. 11405, pp. 83–105. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12612-4_5
Halevi, S., Shoup, V.: Bootstrapping for HElib. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015. LNCS, vol. 9056, pp. 641–670. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46800-5_25
Jayet-Griffon, C., Cornelie, M., Maistri, P., Elbaz-Vincent, P., Leveugle, R.: Polynomial multipliers for fully homomorphic encryption on FPGA. In: 2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig), pp. 1–6 (2015)
Kim, S., Lee, K., Cho, W., Nam, Y., Cheon, J.H., Rutenbar, R.A.: Hardware architecture of a number theoretic transform for a bootstrappable RNS-based homomorphic encryption scheme. In: IEEE FCCM, pp. 56–64. IEEE (2020)
Langlois, A., Stehlé, D.: Hardness of decision (R)LWE for any modulus. Technical report, Citeseer (2012)
Lepoint, T.: FV-NFLlib: library implementing the Fan-Vercauteren homomorphic encryption scheme
Longa, P., Naehrig, M.: Speeding up the number theoretic transform for faster ideal lattice-based cryptography. In: Foresti, S., Persiano, G. (eds.) CANS 2016. LNCS, vol. 10052, pp. 124–139. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48965-0_8
López-Alt, A., Tromer, E., Vaikuntanathan, V.: On-the-fly multiparty computation on the cloud via multikey fully homomorphic encryption. In: ACM STOC, pp. 1219–1234 (2012)
Oder, T., Güneysu, T., Valencia, F., Khalid, A., O’Neill, M., Regazzoni, F.: Lattice-based cryptography: from reconfigurable hardware to ASIC. In: ISIC, pp. 1–4 (2016)
Öztürk, E., Doröz, Y., Sunar, B., Savas, E.: Accelerating somewhat homomorphic evaluation using FPGAs. IACR Cryptology ePrint Archive 2015:294 (2015)
Pöppelmann, T., Naehrig, M., Putnam, A., Macias, A.: Accelerating homomorphic evaluation on reconfigurable hardware. In: Güneysu, T., Handschuh, H. (eds.) CHES 2015. LNCS, vol. 9293, pp. 143–163. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48324-4_8
Pourbigharaz, F., Yassine, H.M.: Intermediate signed-digit stage to perform residue to binary transformations based on CRT. In: IEEE ISCAS, vol. 2, pp. 353–356 (1994)
Rader, C.M.: Discrete convolutions via Mersenne transforms. IEEE Trans. Comput. C-21, 1269–1273 (1972)
Reis, D., Takeshita, J., Jung, T., Niemier, M., Hu, X.S.: Computing-in-memory for performance and energy-efficient homomorphic encryption. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 28(11), 2300–2313 (2020)
Riazi, M.S., Laine, K., Pelton, B., Dai, W.: HEAX: an architecture for computing on encrypted data. In: ACM ASPLOS 2020, pp. 1295–1309 (2020)
Roy, S.S., Turan, F., Jarvinen, K., Vercauteren, F., Verbauwhede, I.: FPGA-based high-performance parallel architecture for homomorphic computing on encrypted data. In: 2019 IEEE International Symposium on HPCA, pp. 387–398. IEEE (2019)
Shenoy, A., Kumaresan, R.: Fast base extension using a redundant modulus in RNS. IEEE Trans. Comput. 38(2), 292–297 (1989)
Synopsys Inc.: HSPICE. Version O-2018.09-1 (2018)
Takeshita, J., Schoenbauer, M., Karl, R., Jung, T.: Enabling faster operations for deeper circuits in full RNS variants of FV-like somewhat homomorphic encryption
Wang, W., Swamy, M., Ahmad, M.O., Wang, Y.: A study of the residue-to-binary converters for the three-moduli sets. IEEE ToCS I Fund. Theory Appl. 50(2), 235–243 (2003)
Öztürk, E., Doröz, Y., Savaş, E., Sunar, B.: A custom accelerator for homomorphic encryption applications. IEEE Trans. Comput. 66(1), 3–16 (2017)
Acknowledgements
The authors thank Matthew Schoenbauer (University of Notre Dame), Carl Pomerance (Dartmouth College), Kim Laine (Microsoft Research), and M. Sadegh Riazi (UC San Diego) for their insights.
Appendices
A Proofs for Novel Optimizations
In this section, we present proofs of correctness for our novel RNS optimizations.
A.1 Proof of Theorem 1
Proof
If \(x\) is even, then \(x = 2y\), and \(x\cdot 2^{g-1} = (2y)2^{g-1} = 2^g y \equiv y \mod (2^g - 1)\). If \(x\) is odd, then \(x = 2y+1\), and \(x\cdot 2^{g-1} = (2y+1)2^{g-1} = 2^g y + 2^{g-1} \equiv y + 2^{g-1} \mod (2^g - 1)\). \(\square \)
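The two cases of this proof amount to halving modulo \(2^g - 1\) with a shift and at most one addition. The following Python sketch illustrates that reading; the function name `halve_mod_mersenne` and the assumption that the input is already reduced to \(0 \le x < 2^g - 1\) are ours, not taken from the paper.

```python
def halve_mod_mersenne(x: int, g: int) -> int:
    """Return x * 2^(g-1) mod (2^g - 1), i.e. x/2 modulo 2^g - 1.

    Assumes g >= 2 and 0 <= x < 2^g - 1 (hypothetical preconditions).
    """
    y = x >> 1                  # x = 2y (even) or x = 2y + 1 (odd)
    if x & 1 == 0:
        return y                # even case: result is y
    return y + (1 << (g - 1))   # odd case: result is y + 2^(g-1)


if __name__ == "__main__":
    g = 5
    q = (1 << g) - 1
    # Compare the shift-based result against a direct modular multiplication.
    print(all(halve_mod_mersenne(x, g) == x * (1 << (g - 1)) % q for x in range(q)))
```

In both cases the result is already below \(2^g - 1\) for reduced inputs, so no final reduction is needed in this sketch.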
A.2 Proof of Lemma 2
Proof
\(x(2^g - 1) = x\cdot 2^g - x \equiv -x \mod 2^g\). \(\square \)
A.3 Proof of Theorem 2
Proof
If \(x\) is even, then \(x = 2y\), and \(x(2^{g-1}+1) = (2y)(2^{g-1}+1) = 2^g y + 2y \equiv -y + 2y \equiv y \mod (2^g + 1)\). If \(x\) is odd, then \(x = 2y+1\), and \(x(2^{g-1}+1) = (2y+1)(2^{g-1}+1) = 2^g y + 2y + 2^{g-1} + 1 \equiv -y + 2y + 2^{g-1} + 1 \equiv y + 2^{g-1} + 1 \mod (2^g + 1)\). \(\square \)
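The identities in the proofs of Lemma 2 and Theorem 2 can be checked in the same way. The sketch below is a minimal illustration under our own assumptions: the function names are hypothetical, and inputs are taken to be reduced (\(0 \le x < 2^g\) for Lemma 2, \(0 \le x \le 2^g\) for Theorem 2).

```python
def negate_mod_pow2(x: int, g: int) -> int:
    """Return x * (2^g - 1) mod 2^g, which equals -x mod 2^g (Lemma 2)."""
    return (-x) & ((1 << g) - 1)    # masking keeps the low g bits of -x


def halve_mod_fermat(x: int, g: int) -> int:
    """Return x * (2^(g-1) + 1) mod (2^g + 1), i.e. x/2 modulo 2^g + 1.

    Assumes g >= 1 and 0 <= x <= 2^g (hypothetical preconditions).
    """
    y = x >> 1                          # x = 2y (even) or x = 2y + 1 (odd)
    if x & 1 == 0:
        return y                        # even case: result is y
    return y + (1 << (g - 1)) + 1       # odd case: result is y + 2^(g-1) + 1


if __name__ == "__main__":
    g = 4
    q = (1 << g) + 1
    # Doubling the halved value should give back x modulo 2^g + 1.
    print(all((2 * halve_mod_fermat(x, g)) % q == x for x in range(q)))
```

As with the \(2^g - 1\) case, the odd branch never exceeds \(2^g\) for reduced inputs, so the sketch performs no further reduction.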
B Proofs for Fermat-like Coprimes
Let the terms \(q_i\) be elements of \(S_2 = \{2^m-1, 2^m+1, 2^{2m}+1, 2^{4m}+1, \ldots, 2^{2^{f-1}m}+1, 2^{2^f m}+1\}\), as in Sect. 3.2. Then the following results hold [32]:
Lemma 3
For \(q_i \in S_2\) with \(q_0 = 2^m-1\), and with \(k = f+2\) the number of moduli in \(S_2\), \(|\frac{q_0}{q}|_{q_0} = 2^{m-(k-1)}\).
Proof
Note that \(\frac{q}{q_0} = (2^m+1)(2^{2m}+1)(2^{4m}+1)\cdots (2^{2^f m}+1)\). Because \(2^m\) is congruent to 1 modulo \(q_0\), each of the \(f+1\) terms in this product is congruent to 2. Thus \(|\frac{q}{q_0}|_{q_0} = |2^{f+1}|_{q_0}\). Since \(2^m \equiv 1\), the inverse of this is \(|\frac{q_0}{q}|_{q_0} = 2^{m-(f+1)} = 2^{m-(k-1)}\). \(\square \)
Lemma 4
For \(q_i = 2^{2^{i-1} m}+1\), \(i \in [1,f+1]\), \(|\frac{q_i}{q}|_{q_i}\) is \(2^{2^{i-1}m-(f-i+2)}\).
Proof
Note that \(\frac{q}{q_i} = (2^{2^{i-1} m}-1) (2^{2^{i} m}+1) (2^{2^{i+1} m}+1) \cdots (2^{2^{f} m}+1)\), because the product of the moduli \(q_0, \ldots, q_{i-1}\) telescopes to \(2^{2^{i-1} m}-1\). We see that \(|(2^{2^{i-1} m}-1)|_{q_i} = |(-1)-1|_{q_i} = |-2|_{q_i}\). For the remaining terms \(2^{2^{j} m}+1\) (for \(j \in [i, f]\)), we have \(|2^{2^{j} m}+1|_{q_i} = |(2^{2^{i-1} m})^{2^{j-(i-1)}}+1|_{q_i} = |(-1)^{2^{j-(i-1)}}+1|_{q_i}\). Because \(j \ge i\), the exponent \(2^{j-(i-1)}\) is even, so this is equal to \(|1+1|_{q_i} = 2\). Combining the first factor with these \(f-(i-1)\) terms, we see that \(|\frac{q}{q_i}|_{q_i} = |(-2)2^{f-(i-1)}|_{q_i} = |-2^{f-i+2}|_{q_i}\). Because \(2^{2^{i-1} m} \equiv -1 \mod q_i\), the inverse term \(|\frac{q_i}{q}|_{q_i}\) is \(2^{2^{i-1}m-(f-i+2)}\). \(\square \)
In both of these terms, the exponent of two should always be positive; if this is not the case then too many moduli have been chosen for too small a dynamic range [39].
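The closed forms in Lemmas 3 and 4 can be checked numerically. The Python sketch below is only an illustration under stated assumptions: it indexes the moduli as \(q_0 = 2^m-1\) and \(q_i = 2^{2^{i-1}m}+1\) for \(i = 1, \ldots, f+1\) (the indexing under which the stated exponents come out), and all function names are hypothetical.

```python
import math


def fermat_like_moduli(m: int, f: int) -> list[int]:
    """Return [2^m - 1, 2^m + 1, 2^(2m) + 1, ..., 2^(2^f * m) + 1]."""
    return [(1 << m) - 1] + [(1 << ((1 << j) * m)) + 1 for j in range(f + 1)]


def claimed_inverses(m: int, f: int) -> list[int]:
    """Powers of two claimed to invert q/q_i modulo q_i, for i = 0..f+1."""
    exps = [m - (f + 1)]                                  # Lemma 3 (i = 0)
    exps += [(1 << (i - 1)) * m - (f - i + 2)             # Lemma 4 (i >= 1)
             for i in range(1, f + 2)]
    if min(exps) < 0:
        raise ValueError("too many moduli for this m: negative exponent")
    return [1 << e for e in exps]


def check_lemmas(m: int, f: int) -> bool:
    """Verify (q / q_i) * inverse == 1 (mod q_i) for every modulus."""
    moduli = fermat_like_moduli(m, f)
    q = math.prod(moduli)
    return all((q // qi) * inv % qi == 1
               for qi, inv in zip(moduli, claimed_inverses(m, f)))


if __name__ == "__main__":
    print(fermat_like_moduli(8, 2))   # moduli for m = 8, f = 2
    print(check_lemmas(8, 2))
```

The `ValueError` branch mirrors the remark above: a negative exponent signals that too many moduli were chosen for the given \(m\).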
C NTT Algorithm
Algorithm 7 gives the algorithm of the Number-Theoretic Transform.
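Algorithm 7 itself is not reproduced in this text. As a rough illustration only, a textbook iterative radix-2 Cooley-Tukey cyclic NTT might look like the Python sketch below. It is not the authors' Algorithm 7; the function names, the toy prime 17, and the root of unity 2 are assumptions chosen for a small example, and a B/FV implementation would typically use a negacyclic variant with much larger parameters.

```python
def ntt(a: list[int], omega: int, p: int) -> list[int]:
    """Cyclic NTT of a modulo prime p.

    len(a) must be a power of two, and omega must be a primitive
    len(a)-th root of unity modulo p.
    """
    n = len(a)
    a = list(a)
    # Bit-reversal permutation of the input order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Butterfly passes over blocks of size 2, 4, ..., n.
    length = 2
    while length <= n:
        w_len = pow(omega, n // length, p)   # primitive length-th root
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % p
                a[start + k] = (u + v) % p
                a[start + k + half] = (u - v) % p
                w = w * w_len % p
        length *= 2
    return a


def intt(a: list[int], omega: int, p: int) -> list[int]:
    """Inverse cyclic NTT: transform with omega^-1, then scale by n^-1."""
    n_inv = pow(len(a), -1, p)
    return [x * n_inv % p for x in ntt(a, pow(omega, -1, p), p)]


if __name__ == "__main__":
    p, omega = 17, 2                 # 2 has order 8 modulo 17
    coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
    print(intt(ntt(coeffs, omega, p), omega, p) == coeffs)
```

The modular inverses use the three-argument `pow` with a negative exponent, which requires Python 3.8 or later.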
© 2021 Springer Nature Switzerland AG
Cite this paper
Takeshita, J., Reis, D., Gong, T., Niemier, M., Hu, X.S., Jung, T. (2021). Algorithmic Acceleration of B/FV-Like Somewhat Homomorphic Encryption for Compute-Enabled RAM. In: Dunkelman, O., Jacobson, Jr., M.J., O'Flynn, C. (eds) Selected Areas in Cryptography. SAC 2020. Lecture Notes in Computer Science, vol 12804. Springer, Cham. https://doi.org/10.1007/978-3-030-81652-0_3
DOI: https://doi.org/10.1007/978-3-030-81652-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-81651-3
Online ISBN: 978-3-030-81652-0
eBook Packages: Computer Science, Computer Science (R0)