Abstract
Particular instantiations of the Offset Merkle-Damgård authenticated encryption scheme (OMD) represent highly secure alternatives to AES-GCM. It is already known that OMD can be efficiently implemented in software. Given this, in this paper we focus on speeding up OMD in hardware, more precisely on FPGA platforms. To this end, we propose a new OMD instantiation based on the compression function of BLAKE2b. Moreover, to the best of our knowledge, we present the first FPGA implementation results for the SHA-512 instantiation of OMD, as well as the first architecture of an online authenticated encryption system based on OMD.
Notes
1. All the withdrawn schemes are listed on the competition’s website. Almost all submissions were withdrawn because of attacks reported by the community. It can easily be observed that no attack was presented for OMD.
2. Throughput to Area ratio.
3. Number used only once.
4. From a security perspective.
5. xcvu9p-flga2104-2L-e.
6. Moreover, the attack applies neither to the OMD CAESAR submission nor to the misuse-resistant variants of [14].
References
OMDv2 CAESAR Submission. https://competitions.cr.yp.to/round2/omdv20c.pdf
Password Hashing Competition. https://password-hashing.net
Source Code. https://github.com/megastefan22/OMD
Ashur, T., Mennink, B.: Trivial Nonce-misusing attack on pure OMD. IACR Cryptology ePrint Archive (2015)
Aumasson, J.-P., Meier, W., Phan, R.C.-W., Henzen, L.: The Hash Function BLAKE. ISC. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44757-4
Cogliani, S., et al.: OMD: a compression function mode of operation for authenticated encryption. In: Joux, A., Youssef, A. (eds.) SAC 2014. LNCS, vol. 8781, pp. 112–128. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-13051-4_7
Diehl, W., Gaj, K.: RTL implementations and FPGA benchmarking of selected CAESAR round two authenticated ciphers. Microprocess. Microsyst. (2017). https://www.sciencedirect.com/science/article/abs/pii/S0141933117300352
Homsirikamol, E., Rogawski, M., Gaj, K.: Comparing hardware performance of fourteen round two SHA-3 candidates using FPGAs. IACR Cryptology ePrint Archive (2010). http://eprint.iacr.org/2010/445
Krovetz, T., Rogaway, P.: The software performance of authenticated-encryption modes. In: Joux, A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 306–327. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21702-9_18
Maimuţ, D.: Authentication and encryption protocols: design, attacks and algorithmic tools. Ph.D. thesis, École normale supérieure (2015)
Maimuţ, D., Reyhanitabar, R.: Authenticated encryption: toward next-generation algorithms. IEEE Secur. Privacy 12(2), 70–72 (2014)
National Institute of Standards and Technology: FIPS PUB 180–4: Secure Hash Standard. NIST, August 2015
Reyhanitabar, R., Vaudenay, S., Vizár, D.: Misuse-resistant variants of the OMD authenticated encryption mode. In: Chow, S.S.M., Liu, J.K., Hui, L.C.K., Yiu, S.M. (eds.) ProvSec 2014. LNCS, vol. 8782, pp. 55–70. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-12475-9_5
Reyhanitabar, R., Vaudenay, S., Vizár, D.: Boosting OMD for almost free authentication of associated data. In: Leander, G. (ed.) FSE 2015. LNCS, vol. 9054, pp. 411–427. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48116-5_20
Rogaway, P.: Authenticated-encryption with associated-data. In: CCS 2002, pp. 98–107. ACM (2002)
Rogaway, P.: Nonce-based symmetric encryption. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 348–358. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25937-4_22
Acknowledgments
The authors would like to thank Traian Neacşa and George Teşeleanu for their helpful comments.
Appendices
A OMD Pseudocode
B The sha-512 and blake2b Compression Functions
B.1 Preliminaries
In the following, by “word” we mean a group of \(w = 64\) bits. Namely, in sha-512 each word is a 64-bit string.
\(ROTR^n(x)\) and \(SHR^n(x)\): Let x be a w-bit word and n an integer with \(0 \le n < w\). The rotate right (circular right shift) operation is defined by \(ROTR^n(x) = (x \gg n)\vee (x \ll (w-n))\), where bits shifted beyond position \(w-1\) are discarded. The right shift operation is defined by \(SHR^n(x) = (x \gg n)\).
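Choice and Majority Functions. The choice function and the majority function (also called the median operator) can be defined as follows:
$$ Ch(x,y,z) = (x \wedge y) \oplus (\lnot x \wedge z), \qquad Maj(x,y,z) = (x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z) $$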
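The word operations of this section can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the expressions for the choice and majority functions are the standard ones from FIPS 180-4. Since Python integers are unbounded, every result is masked back to \(w = 64\) bits.

```python
W = 64                   # word size w in bits
MASK = (1 << W) - 1      # keeps every result within w bits


def rotr(x: int, n: int) -> int:
    """ROTR^n(x): circular right shift of a w-bit word, 0 <= n < w."""
    return ((x >> n) | (x << (W - n))) & MASK


def shr(x: int, n: int) -> int:
    """SHR^n(x): logical right shift of a w-bit word."""
    return x >> n


def ch(x: int, y: int, z: int) -> int:
    """Choice: each bit of x selects the matching bit of y (if 1) or z (if 0)."""
    return (x & y) ^ (~x & MASK & z)


def maj(x: int, y: int, z: int) -> int:
    """Majority (median): each output bit is the value held by at least two inputs."""
    return (x & y) ^ (x & z) ^ (y & z)
```

For instance, `rotr(1, 1)` moves the lowest bit to the highest position, while `shr(1, 1)` simply drops it.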
B.2 The sha-512 Compression Function
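Sigma Functions. The functions \(\varSigma ^{\{512\}}_0\) and \(\varSigma ^{\{512\}}_1\) are defined as follows:
$$ \varSigma ^{\{512\}}_0(x) = ROTR^{28}(x) \oplus ROTR^{34}(x) \oplus ROTR^{39}(x) $$
$$ \varSigma ^{\{512\}}_1(x) = ROTR^{14}(x) \oplus ROTR^{18}(x) \oplus ROTR^{41}(x) $$
The \(\sigma ^{\{512\}}_0\) and \(\sigma ^{\{512\}}_1\) functions are defined as follows:
$$ \sigma ^{\{512\}}_0(x) = ROTR^{1}(x) \oplus ROTR^{8}(x) \oplus SHR^{7}(x) $$
$$ \sigma ^{\{512\}}_1(x) = ROTR^{19}(x) \oplus ROTR^{61}(x) \oplus SHR^{6}(x) $$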
The Process. The sha-512 compression function maps a 512-bit hash input and a 1024-bit message input to a 512-bit output.
Let H be the 512-bit hash input (chaining input) and M be the 1024-bit message input. These two inputs are represented respectively by an array of 8 64-bit words \(H_0 \Vert \cdots \Vert H_7\) (see Table 2) and an array of 16 64-bit words \(M_0 \Vert \cdots \Vert M_{15}\). The 512-bit output value C is also represented as an array of 8 64-bit words \(C_0 \Vert \cdots \Vert C_7\).
During the process of compression, a sequence of 80 constant 64-bit words \(K^{\{512\}}_0, \ldots , K^{\{512\}}_{79}\) is used. These 64-bit words represent the first 64 bits of the fractional parts of the cube roots of the first 80 prime numbers. In hex, these constant words are given in Table 3 (from left to right).
We further provide the reader with the description of the sha-512 compression function. The addition (\(+\)) is performed modulo \(2^{64}\).
1. Prepare the message schedule \(\{W_t\}\):
$$ W_t = \left\{ \begin{array}{lr} M_t, & 0 \le t \le 15 \\ \sigma ^{\{512\}}_1(W_{t-2}) + W_{t-7} + \sigma ^{\{512\}}_0(W_{t-15}) + W_{t-16}, & 16 \le t \le 79 \end{array} \right. $$
2. Initialize the eight working variables a, b, c, d, e, f, g and h with the hash input value H:
\(a = H_0 \qquad b = H_1 \qquad c = H_2 \qquad d = H_3\)
\(e = H_4 \qquad f = H_5 \qquad g = H_6 \qquad h = H_7\)
3. For \(t=0\) to 79, do:
{
\(T_1 = h + \varSigma ^{\{512\}}_1(e) + Ch(e,f,g) + K^{\{512\}}_t + W_t\)
\(T_2 = \varSigma ^{\{512\}}_0(a) + Maj(a,b,c)\)
\(h = g \qquad g = f \qquad f = e \qquad e = d + T_1\)
\(d = c \qquad c = b \qquad b = a \qquad a = T_1 + T_2\)
}
4. Compute the 512-bit output (hash) value \(C = C_0 \Vert \cdots \Vert C_7\) as:
\(C_0 = a + H_0 \qquad C_1 = b + H_1 \qquad C_2 = c + H_2 \qquad C_3 = d + H_3\)
\(C_4 = e + H_4 \qquad C_5 = f + H_5 \qquad C_6 = g + H_6 \qquad C_7 = h + H_7\)
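The whole process above can be sketched as a short Python reference model. This is an illustration under stated assumptions, not the hardware design discussed in the paper: the helper names (`first_primes`, `icbrt`, `sha512_compress`) are ours, the rotation amounts of the sigma functions are those of FIPS 180-4, and the constants \(K^{\{512\}}_t\) are derived from the cube roots of the first 80 primes instead of being copied from Table 3. The usage part compresses the single padded block of the message "abc" (padding and big-endian word encoding as in FIPS 180-4) and compares the result with Python's `hashlib`.

```python
import hashlib
import math

MASK = (1 << 64) - 1  # all additions are performed modulo 2^64


def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
        n += 1
    return primes


def icbrt(n):
    """Floor of the cube root of a non-negative integer (binary search)."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# First 64 bits of the fractional parts of the cube roots of the first 80 primes.
K = [icbrt(p << 192) & MASK for p in first_primes(80)]


def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK


def big_sigma0(x): return rotr(x, 28) ^ rotr(x, 34) ^ rotr(x, 39)
def big_sigma1(x): return rotr(x, 14) ^ rotr(x, 18) ^ rotr(x, 41)
def small_sigma0(x): return rotr(x, 1) ^ rotr(x, 8) ^ (x >> 7)
def small_sigma1(x): return rotr(x, 19) ^ rotr(x, 61) ^ (x >> 6)
def ch(x, y, z): return (x & y) ^ (~x & MASK & z)
def maj(x, y, z): return (x & y) ^ (x & z) ^ (y & z)


def sha512_compress(H, M):
    """H: 8 chaining words, M: 16 message words; returns the 8 output words C."""
    # Step 1: message schedule.
    W = list(M)
    for t in range(16, 80):
        W.append((small_sigma1(W[t - 2]) + W[t - 7]
                  + small_sigma0(W[t - 15]) + W[t - 16]) & MASK)
    # Step 2: working variables.
    a, b, c, d, e, f, g, h = H
    # Step 3: 80 rounds.
    for t in range(80):
        T1 = (h + big_sigma1(e) + ch(e, f, g) + K[t] + W[t]) & MASK
        T2 = (big_sigma0(a) + maj(a, b, c)) & MASK
        h, g, f, e = g, f, e, (d + T1) & MASK
        d, c, b, a = c, b, a, (T1 + T2) & MASK
    # Step 4: feed-forward of the chaining input.
    return [(x + y) & MASK for x, y in zip((a, b, c, d, e, f, g, h), H)]


# Usage: hash the one-block message "abc" and compare with hashlib.
IV = [math.isqrt(p << 128) & MASK for p in first_primes(8)]  # square-root constants
block = b"abc" + b"\x80" + bytes(108) + (24).to_bytes(16, "big")
M = [int.from_bytes(block[8 * i:8 * i + 8], "big") for i in range(16)]
digest = b"".join(x.to_bytes(8, "big") for x in sha512_compress(IV, M))
print(digest == hashlib.sha512(b"abc").digest())
```

Deriving the constants arithmetically keeps the sketch short and avoids transcription errors; a hardware implementation would instead store them in a ROM or in LUTs.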
B.3 The blake2b Compression Function
The initial values H of blake2b were chosen to be precisely those of SHA-512 (given in Table 2). These values “were obtained by taking the first sixty-four bits of the fractional parts of the square roots of the first eight prime numbers”, according to [13].
Thus, the compression function blake2b takes as input:
Let the round permutations \(\sigma _r\) be in accordance with Table 4, where \(r=\overline{0,9}\). Note that for rounds \(r \ge 10\) the permutation used is \(\sigma _{r \bmod 10}\). The core function G of blake2b is defined as follows:
\(a := a + b + m_{\sigma _r(2i)} \quad d := ROTR^{32}(d \oplus a) \quad c := c + d \quad b := ROTR^{24}(b \oplus c)\)
\(a := a + b + m_{\sigma _r(2i+1)} \quad d := ROTR^{16}(d \oplus a) \quad c := c + d \quad b := ROTR^{63}(b \oplus c)\)
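The eight assignments of G translate directly into code. The following Python sketch is illustrative only: the function signature is our own choice, with `x` and `y` standing for the two message words \(m_{\sigma _r(2i)}\) and \(m_{\sigma _r(2i+1)}\) that the caller selects via Table 4, and all additions taken modulo \(2^{64}\).

```python
MASK = (1 << 64) - 1  # additions are modulo 2^64


def rotr(x: int, n: int) -> int:
    """Rotate a 64-bit word right by n positions."""
    return ((x >> n) | (x << (64 - n))) & MASK


def G(a: int, b: int, c: int, d: int, x: int, y: int):
    """One application of the blake2b core function to four state words.

    x and y play the roles of m[sigma_r(2i)] and m[sigma_r(2i+1)].
    """
    a = (a + b + x) & MASK
    d = rotr(d ^ a, 32)
    c = (c + d) & MASK
    b = rotr(b ^ c, 24)
    a = (a + b + y) & MASK
    d = rotr(d ^ a, 16)
    c = (c + d) & MASK
    b = rotr(b ^ c, 63)
    return a, b, c, d


# Usage: a single set bit in `a` already spreads over all four words.
a, b, c, d = G(1, 0, 0, 0, 0, 0)
print(hex(a), hex(b), hex(c), hex(d))
```

Since G consists only of additions, XORs and fixed rotations, the rotations cost nothing but wiring in hardware, which is one reason the function maps well onto FPGAs.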
C Explicit Performance Metrics
The main performance metrics used in our work are throughput, area and throughput to area ratio (Tp/A). They are presented in Figs. 4, 5 and 6 as functions of the block size of a message (represented on the x axis in all plots). The block size n ranges from 64 bytes to 9600 bytes (we chose these values in order to meet the requirements of an online system). Both the OMD-sha-512 and OMD-blake2b instantiations are taken into consideration.
All formulas used to generate the plots are based on the metrics described in Table 1. We recall that, in the following, Setup represents the number of clock cycles required by the initialization phase, Message refers to the number of clock cycles required to process a message composed of n blocks, Tag represents the number of clock cycles required to compute the tag and Frequency refers to the clock frequency of the FPGA circuit.
In Fig. 4 we use latency as the parameter represented by the y axis. We computed latency by the following formula:
We computed the throughput by applying the following formula:
The computation of Tp/A is straightforward.
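One common way to relate these quantities is sketched below in Python. The sketch rests on assumptions that may differ in detail from the formulas used for the plots: latency is taken as the total cycle count (Setup + Message + Tag) divided by Frequency, throughput as the number of processed message bits divided by that latency, and Tp/A as throughput divided by area. The function and parameter names (`message_bits`, `area`) are hypothetical.

```python
def latency_seconds(setup: int, message: int, tag: int, frequency_hz: float) -> float:
    """Assumed latency: total clock cycles divided by the clock frequency."""
    return (setup + message + tag) / frequency_hz


def throughput_bps(message_bits: int, setup: int, message: int, tag: int,
                   frequency_hz: float) -> float:
    """Assumed throughput: message bits processed per second of latency."""
    return message_bits / latency_seconds(setup, message, tag, frequency_hz)


def throughput_per_area(throughput: float, area: float) -> float:
    """Tp/A: throughput divided by the occupied area (e.g. LUTs or slices)."""
    return throughput / area


# Usage with made-up numbers (not measurements from the paper).
lat = latency_seconds(setup=10, message=80, tag=10, frequency_hz=100.0)
tp = throughput_bps(1024, setup=10, message=80, tag=10, frequency_hz=100.0)
print(lat, tp, throughput_per_area(tp, area=512.0))
```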
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Maimuţ, D., Mega, Ş.A. (2020). Speeding up OMD Instantiations in Hardware. In: Simion, E., Géraud-Stewart, R. (eds) Innovative Security Solutions for Information Technology and Communications. SecITC 2019. Lecture Notes in Computer Science(), vol 12001. Springer, Cham. https://doi.org/10.1007/978-3-030-41025-4_13
DOI: https://doi.org/10.1007/978-3-030-41025-4_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41024-7
Online ISBN: 978-3-030-41025-4
eBook Packages: Computer Science, Computer Science (R0)