Speeding up OMD Instantiations in Hardware

Maimuţ, Diana; Mega, Ştefan Alexandru

doi:10.1007/978-3-030-41025-4_13

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12001)

Included in the following conference series:

International Conference on Information Technology and Communications Security

Abstract

Particular instantiations of the Offset Merkle Damgård authenticated encryption scheme (OMD) represent highly secure alternatives to AES-GCM. OMD is already known to admit efficient software implementations. We therefore focus on speeding up OMD in hardware, more precisely on FPGA platforms. To this end, we propose a new OMD instantiation based on the compression function of BLAKE2b. Moreover, to the best of our knowledge, we present the first FPGA implementation results for the SHA-512 instantiation of OMD as well as the first architecture of an online authenticated encryption system based on OMD.

Notes

1.
All the withdrawn schemes are listed on the competition’s website. Almost all the withdrawn submissions are due to attacks reported by the community. It can easily be observed that for OMD no attack was presented.
2.
Throughput to Area ratio.
3.
Number used only once.
4.
From a security perspective.
5.
xcvu9p-flga2104-2L-e.
6.
Moreover, the attack does not apply to the OMD CAESAR submission and to the misuse-resistant variants of [14].

References

CAESAR. https://competitions.cr.yp.to/caesar.html
OMDv2 CAESAR Submission. https://competitions.cr.yp.to/round2/omdv20c.pdf
Password Hashing Competition. https://password-hashing.net
Source Code. https://github.com/megastefan22/OMD
Ashur, T., Mennink, B.: Trivial Nonce-misusing attack on pure OMD. IACR Cryptology ePrint Archive (2015)
Aumasson, J.-P., Meier, W., Phan, R.C.-W., Henzen, L.: The Hash Function BLAKE. ISC. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44757-4
Cogliani, S., et al.: OMD: a compression function mode of operation for authenticated encryption. In: Joux, A., Youssef, A. (eds.) SAC 2014. LNCS, vol. 8781, pp. 112–128. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-13051-4_7
Diehl, W., Gaj, K.: RTL implementations and FPGA benchmarking of selected CAESAR round two authenticated ciphers. Microprocess. Microsyst. (2017). https://www.sciencedirect.com/science/article/abs/pii/S0141933117300352
Homsirikamol, E., Rogawski, M., Gaj, K.: Comparing hardware performance of fourteen round two SHA-3 candidates using FPGAs. IACR Cryptology ePrint Archive (2010). http://eprint.iacr.org/2010/445
Krovetz, T., Rogaway, P.: The software performance of authenticated-encryption modes. In: Joux, A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 306–327. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21702-9_18
Maimuţ, D.: Authentication and encryption protocols: design, attacks and algorithmic tools. Ph.D. thesis, École normale supérieure (2015)
Maimuţ, D., Reyhanitabar, R.: Authenticated encryption: toward next-generation algorithms. IEEE Secur. Privacy 12(2), 70–72 (2014)
National Institute of Standards and Technology: FIPS PUB 180–4: Secure Hash Standard. NIST, August 2015
Reyhanitabar, R., Vaudenay, S., Vizár, D.: Misuse-resistant variants of the OMD authenticated encryption mode. In: Chow, S.S.M., Liu, J.K., Hui, L.C.K., Yiu, S.M. (eds.) ProvSec 2014. LNCS, vol. 8782, pp. 55–70. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-12475-9_5
Reyhanitabar, R., Vaudenay, S., Vizár, D.: Boosting OMD for almost free authentication of associated data. In: Leander, G. (ed.) FSE 2015. LNCS, vol. 9054, pp. 411–427. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48116-5_20
Rogaway, P.: Authenticated-encryption with associated-data. In: CCS 2002, pp. 98–107. ACM (2002)
Rogaway, P.: Nonce-based symmetric encryption. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 348–358. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25937-4_22

Acknowledgments

The authors would like to thank Traian Neacşa and George Teşeleanu for their helpful comments.

Author information

Authors and Affiliations

Advanced Technologies Institute, 10 Dinu Vintilă, Bucharest, Romania
Diana Maimuţ & Ştefan Alexandru Mega
Politehnica University of Bucharest, Bucharest, Romania
Ştefan Alexandru Mega

Authors

Diana Maimuţ
Ştefan Alexandru Mega

Corresponding author

Correspondence to Diana Maimuţ.

Editor information

Editors and Affiliations

Polytechnic University of Bucharest, Bucharest, Romania
Emil Simion
École Normale Supérieure, Paris, France
Rémi Géraud-Stewart

Appendices

A OMD Pseudocode

B The sha-512 and blake2b Compression Functions

B.1 Preliminaries

In the following, by “word” we mean a group of $w = 64$ bits. Namely, in sha-512 each word is a 64-bit string.

$ROTR^n(x)$ and $SHR^n(x)$: Let x be a w-bit word and n an integer with $0 \le n < w$. The rotate right (circular right shift) operation is defined by $ROTR^n(x) = (x \gg n) \vee (x \ll (w-n))$. The right shift operation is defined by $SHR^n(x) = (x \gg n)$.
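As a minimal sketch, the two operations can be written in Python for $w = 64$. The names `rotr`, `shr`, `W` and `MASK64` are illustrative and do not come from the paper. Python integers are unbounded, so the left-shifted part is masked back to 64 bits, which the w-bit definition above leaves implicit.

```python
W = 64                  # word size in bits, as in the text
MASK64 = (1 << W) - 1   # keeps intermediate results inside one 64-bit word

def rotr(x: int, n: int) -> int:
    """Rotate the 64-bit word x right by n positions (0 <= n < 64)."""
    return ((x >> n) | (x << (W - n))) & MASK64

def shr(x: int, n: int) -> int:
    """Shift the 64-bit word x right by n positions, filling with zeros."""
    return x >> n
```

Bits pushed out on the right by `rotr` reappear on the left, whereas `shr` discards them.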

Choice and Majority Functions. The choice function and the majority function (also called the median operator) can be defined as follows:

$$\begin{aligned} Ch: \left| \begin{array}{rcl} \{0,1\}^{m} \times \{0,1\}^{m} \times \{0,1\}^{m}&{} \longrightarrow &{} \{0,1\}^{m} \\ x, y, z &{} \longmapsto &{} (x \wedge y) \oplus (\lnot x \wedge z) \\ \end{array} \right. \end{aligned}$$

$$\begin{aligned} Maj: \left| \begin{array}{rcl} \{0,1\}^{m} \times \{0,1\}^{m} \times \{0,1\}^{m}&{} \longrightarrow &{} \{0,1\}^{m} \\ x, y, z &{} \longmapsto &{} (x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z) \\ \end{array} \right. \end{aligned}$$
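The two definitions above translate directly into bitwise operations. The following sketch assumes $m = 64$; the function names `ch` and `maj` and the constant `MASK64` are illustrative choices, not taken from the paper.

```python
MASK64 = (1 << 64) - 1  # assumption: m = 64, the sha-512 word size

def ch(x: int, y: int, z: int) -> int:
    """Choice: each bit of x selects the matching bit of y (if 1) or z (if 0)."""
    return (x & y) ^ (~x & z & MASK64)

def maj(x: int, y: int, z: int) -> int:
    """Majority: each output bit is the value held by at least two inputs."""
    return (x & y) ^ (x & z) ^ (y & z)
```

The mask in `ch` is needed because `~x` is a negative number in Python; masking keeps the result a non-negative 64-bit word.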

B.2 The sha-512 Compression Function

Sigma Functions. The functions $\varSigma ^{\{512\}}_0$ and $\varSigma ^{\{512\}}_1$ are defined as follows:

$$\begin{aligned} \varSigma ^{\{512\}}_0 \left| \begin{array}{rcl} \{0,1\}^{64} &{} \longrightarrow &{} \{0,1\}^{64} \\ x &{} \longmapsto &{}ROTR^{28}(x) \oplus ROTR^{34}(x) \oplus ROTR^{39}(x) \\ \end{array} \right. \end{aligned}$$

$$\begin{aligned} \varSigma ^{\{512\}}_1 \left| \begin{array}{rcl} \{0,1\}^{64} &{} \longrightarrow &{} \{0,1\}^{64} \\ x &{} \longmapsto &{} ROTR^{14}(x) \oplus ROTR^{18}(x) \oplus ROTR^{41}(x) \\ \end{array} \right. \end{aligned}$$

The $\sigma ^{\{512\}}_0$ and $\sigma ^{\{512\}}_1$ functions are defined as follows:

$$\begin{aligned} \sigma ^{\{512\}}_0 \left| \begin{array}{rcl} \{0,1\}^{64} &{} \longrightarrow &{} \{0,1\}^{64} \\ x &{} \longmapsto &{}ROTR^{1}(x) \oplus ROTR^{8}(x) \oplus SHR^{7}(x) \\ \end{array} \right. \end{aligned}$$

$$\begin{aligned} \sigma ^{\{512\}}_1 \left| \begin{array}{rcl} \{0,1\}^{64} &{} \longrightarrow &{} \{0,1\}^{64} \\ x &{} \longmapsto &{} ROTR^{19}(x) \oplus ROTR^{61}(x) \oplus SHR^{6}(x) \\ \end{array} \right. \end{aligned}$$
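A short Python sketch of the four functions follows. The rotation and shift amounts are those of FIPS 180-4 [13], where $\sigma ^{\{512\}}_1$ uses two rotations (by 19 and 61) and one shift (by 6). The function names are illustrative assumptions.

```python
MASK64 = (1 << 64) - 1

def rotr(x: int, n: int) -> int:
    """Rotate a 64-bit word right by n positions."""
    return ((x >> n) | (x << (64 - n))) & MASK64

def big_sigma0(x: int) -> int:
    """Upper-case Sigma_0 of sha-512."""
    return rotr(x, 28) ^ rotr(x, 34) ^ rotr(x, 39)

def big_sigma1(x: int) -> int:
    """Upper-case Sigma_1 of sha-512."""
    return rotr(x, 14) ^ rotr(x, 18) ^ rotr(x, 41)

def small_sigma0(x: int) -> int:
    """Lower-case sigma_0 of sha-512 (two rotations and one shift)."""
    return rotr(x, 1) ^ rotr(x, 8) ^ (x >> 7)

def small_sigma1(x: int) -> int:
    """Lower-case sigma_1 of sha-512 (two rotations and one shift)."""
    return rotr(x, 19) ^ rotr(x, 61) ^ (x >> 6)
```

The upper-case functions are used in the round updates, the lower-case ones in the message schedule.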

The Process. The sha-512 compression function is defined as:

$$\begin{aligned} \text {sha-512}: \left| \begin{array}{rcl} \{0,1\}^{512} \times \{0,1\}^{1024} &{} \longrightarrow &{} \{0,1\}^{512} \\ H, M &{} \longmapsto &{} D \\ \end{array} \right. \end{aligned}$$

Let H be the 512-bit hash input (chaining input) and M be the 1024-bit message input. These two inputs are represented respectively by an array of eight 64-bit words $H_0 \Vert \cdots \Vert H_7$ (see Table 2) and an array of sixteen 64-bit words $M_0 \Vert \cdots \Vert M_{15}$. The 512-bit output value D is also represented as an array of eight 64-bit words $D_0 \Vert \cdots \Vert D_7$.

Table 2. sha-512 initial values

During the process of compression, a sequence of 80 constant 64-bit words $K^{\{512\}}_0, \ldots , K^{\{512\}}_{79}$ is used. These 64-bit words represent the first 64 bits of the fractional parts of the cube roots of the first 80 prime numbers. In hex, these constant words are given in Table 3 (from left to right).

Table 3. sha-512 constants
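The constants can be regenerated from their definition with exact integer arithmetic. The sketch below is an illustration only: `first_primes` and `icbrt` are hypothetical helper names, and the integer Newton iteration is one of several possible ways to obtain the cube root.

```python
MASK64 = (1 << 64) - 1

def first_primes(count: int) -> list[int]:
    """Return the first `count` primes by trial division (fine for count = 80)."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def icbrt(n: int) -> int:
    """Floor of the cube root of a non-negative integer (integer Newton method)."""
    if n == 0:
        return 0
    x = 1 << -(-n.bit_length() // 3)    # power of two that is >= the true root
    while True:
        y = (2 * x + n // (x * x)) // 3
        if y >= x:
            return x
        x = y

# floor(cbrt(p) * 2^64) = icbrt(p * 2^192); the low 64 bits are the
# first 64 bits of the fractional part of cbrt(p).
K = [icbrt(p << 192) & MASK64 for p in first_primes(80)]

print(f"{K[0]:016x}")
```

Working with integers avoids the rounding errors that a floating-point cube root would introduce in the low-order bits.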

We further provide the reader with the description of the sha-512 compression function. The addition ($+$) is performed modulo $2^{64}$.

1.
Preparing the message schedule, $\{W_t\}$:
$$ W_t = \left\{ \begin{array}{lr} M_t, &{} 0 \le t \le 15 \\ \sigma ^{\{512\}}_1(W_{t-2}) + W_{t-7} + \sigma ^{\{512\}}_0(W_{t-15}) + W_{t-16}, &{} 16 \le t \le 79 \end{array} \right. $$
2.
Initialize the eight working variables, a, b, c, d, e, f, g and h with the hash input value H:

$a = H_0 \qquad b = H_1 \qquad c = H_2 \qquad d = H_3$

$e = H_4 \qquad f = H_5 \qquad g = H_6 \qquad h = H_7$
3.
For $t=0$ to 79, do:

{

$T_1 = h + \varSigma ^{\{512\}}_1(e) + Ch(e,f,g) + K^{\{512\}}_t + W_t$

$T_2 = \varSigma ^{\{512\}}_0(a) + Maj(a,b,c)$

$h = g \qquad g = f \qquad f = e \qquad e = d + T_1$

$d = c \qquad c = b \qquad b = a \qquad a = T_1 + T_2$

}
4.
Compute the 512-bit output (hash) value $D = D_0 \Vert \cdots \Vert D_7$ as:

$D_0 = a + H_0 \qquad D_1 = b + H_1 \qquad D_2 = c + H_2 \qquad D_3 = d + H_3$

$D_4 = e + H_4 \qquad D_5 = f + H_5 \qquad D_6 = g + H_6 \qquad D_7 = h + H_7$
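The steps above can be sketched as a single Python function. This is an illustrative software model, not the authors' FPGA design: all names are assumptions, the constants are derived from their definitions rather than copied from Tables 2 and 3, and $\sigma ^{\{512\}}_1$ follows FIPS 180-4 [13]. The helper `one_block_digest` applies the standard SHA-512 padding to a short message so that the result can be compared with a reference implementation such as Python's `hashlib`.

```python
from math import isqrt

MASK = (1 << 64) - 1

def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK

def first_primes(count):
    primes, c = [], 2
    while len(primes) < count:
        if all(c % p for p in primes):
            primes.append(c)
        c += 1
    return primes

def icbrt(n):
    """Floor cube root of a positive integer (integer Newton method)."""
    x = 1 << -(-n.bit_length() // 3)
    while True:
        y = (2 * x + n // (x * x)) // 3
        if y >= x:
            return x
        x = y

PRIMES = first_primes(80)
K = [icbrt(p << 192) & MASK for p in PRIMES]       # round constants
IV = [isqrt(p << 128) & MASK for p in PRIMES[:8]]  # initial chaining value

def sha512_compress(H, M):
    """Map eight chaining words H and sixteen message words M to eight output words."""
    # Step 1: message schedule.
    W = list(M)
    for t in range(16, 80):
        s0 = rotr(W[t - 15], 1) ^ rotr(W[t - 15], 8) ^ (W[t - 15] >> 7)
        s1 = rotr(W[t - 2], 19) ^ rotr(W[t - 2], 61) ^ (W[t - 2] >> 6)
        W.append((s1 + W[t - 7] + s0 + W[t - 16]) & MASK)
    # Step 2: working variables.
    a, b, c, d, e, f, g, h = H
    # Step 3: 80 rounds.
    for t in range(80):
        big_s1 = rotr(e, 14) ^ rotr(e, 18) ^ rotr(e, 41)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[t] + W[t]) & MASK
        big_s0 = rotr(a, 28) ^ rotr(a, 34) ^ rotr(a, 39)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        h, g, f, e = g, f, e, (d + t1) & MASK
        d, c, b, a = c, b, a, (t1 + t2) & MASK
    # Step 4: feed-forward addition with the chaining input.
    return [(x + y) & MASK for x, y in zip((a, b, c, d, e, f, g, h), H)]

def one_block_digest(msg: bytes) -> bytes:
    """SHA-512 digest of a message short enough to fit one padded block."""
    if len(msg) > 111:
        raise ValueError("message does not fit in a single block")
    block = msg + b"\x80" + b"\x00" * (111 - len(msg))
    block += (8 * len(msg)).to_bytes(16, "big")
    M = [int.from_bytes(block[8 * i:8 * i + 8], "big") for i in range(16)]
    return b"".join(w.to_bytes(8, "big") for w in sha512_compress(IV, M))
```

In a hardware implementation the 80 rounds are typically iterated or unrolled in logic; the loop here only mirrors the data flow.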

B.3 The blake2b Compression Function

The initial values H of blake2b were chosen to be precisely those of SHA-512 (given in Table 2). These values “were obtained by taking the first sixty-four bits of the fractional parts of the square roots of the first eight prime numbers”, according to [13].
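The quoted rule can be checked with exact integer arithmetic. The following sketch is illustrative; the names `IV` and `MASK64` are assumptions.

```python
from math import isqrt

MASK64 = (1 << 64) - 1
FIRST_EIGHT_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19)

# floor(sqrt(p) * 2^64) = isqrt(p * 2^128); the low 64 bits are the
# first 64 bits of the fractional part of sqrt(p).
IV = [isqrt(p << 128) & MASK64 for p in FIRST_EIGHT_PRIMES]

for word in IV:
    print(f"{word:016x}")
```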

Thus, the compression function blake2b takes as input:

$$\begin{aligned} \begin{pmatrix} \nu _0 &{} \nu _1 &{} \nu _2 &{} \nu _3 \\ \nu _4 &{} \nu _5 &{} \nu _6 &{} \nu _7 \\ \nu _8 &{} \nu _9 &{} \nu _{10} &{} \nu _{11} \\ \nu _{12} &{} \nu _{13} &{} \nu _{14} &{} \nu _{15} \end{pmatrix}&:= \begin{pmatrix} h_0 &{} h_1 &{} h_2 &{} h_3 \\ h_4 &{} h_5 &{} h_6 &{} h_7 \\ H_0 &{} H_1 &{} H_2 &{} H_3 \\ T_0 \oplus H_4 &{} T_1 \oplus H_5 &{} F_0 \oplus H_6 &{} F_1 \oplus H_7 \end{pmatrix} \end{aligned}$$
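The initialization of the 16-word state can be sketched as follows. The excerpt does not define $T_0, T_1, F_0, F_1$; the sketch assumes, following the usual BLAKE2 convention, that they are two counter words and two finalization flag words. The function name `init_state` is hypothetical.

```python
from math import isqrt

MASK64 = (1 << 64) - 1
# H_0..H_7: the sha-512 initial values, derived from square roots of primes.
IV = [isqrt(p << 128) & MASK64 for p in (2, 3, 5, 7, 11, 13, 17, 19)]

def init_state(h, t0, t1, f0, f1):
    """Build the state v_0..v_15, row by row, from the matrix above.

    h      : eight 64-bit chaining words h_0..h_7
    t0, t1 : assumed to be the two counter words
    f0, f1 : assumed to be the two finalization flag words
    """
    if len(h) != 8:
        raise ValueError("expected eight chaining words")
    return list(h) + IV[:4] + [t0 ^ IV[4], t1 ^ IV[5], f0 ^ IV[6], f1 ^ IV[7]]
```

For example, `init_state([0] * 8, 128, 0, MASK64, 0)` yields a state whose first two rows are zero and whose last two rows carry the initial values, with the third word of the last row inverted by the flag.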

Table 4. Permutations of blake2b

Let the round permutations $\sigma _r$ be in accordance with Table 4, where $r=\overline{0,9}$. Note that for rounds $r \ge 10$ the permutation used is $\sigma _{r \bmod 10}$. The core function G of blake2b is defined as follows:

$a := a + b + m_{\sigma _r(2i)} \quad d := ROTR^{32}(d \oplus a) \quad c := c + d \quad b := ROTR^{24}(b \oplus c)$

$a := a + b + m_{\sigma _r(2i+1)} \quad d := ROTR^{16}(d \oplus a) \quad c := c + d \quad b := ROTR^{63}(b \oplus c)$
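The eight assignments of G can be written as a small Python function. In this sketch the two message words $m_{\sigma _r(2i)}$ and $m_{\sigma _r(2i+1)}$ are passed in as `mx` and `my`, so the permutation table is not needed; these parameter names and the function name `g` are illustrative assumptions.

```python
MASK64 = (1 << 64) - 1

def rotr(x: int, n: int) -> int:
    """Rotate a 64-bit word right by n positions."""
    return ((x >> n) | (x << (64 - n))) & MASK64

def g(a: int, b: int, c: int, d: int, mx: int, my: int):
    """Apply the blake2b core function to four state words and two message words."""
    a = (a + b + mx) & MASK64
    d = rotr(d ^ a, 32)
    c = (c + d) & MASK64
    b = rotr(b ^ c, 24)
    a = (a + b + my) & MASK64
    d = rotr(d ^ a, 16)
    c = (c + d) & MASK64
    b = rotr(b ^ c, 63)
    return a, b, c, d
```

Every step uses only addition modulo $2^{64}$, XOR and a fixed rotation, and rotations by fixed amounts reduce to wiring in hardware.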

C Explicit Performance Metrics

The main performance metrics used in our work are throughput, area, and throughput-to-area ratio (Tp/A). They are presented in Figs. 4, 5 and 6, plotted against the block size of a message (which is represented on the x axis in all plots). The block size n goes from 64 bytes to 9600 bytes (we chose these values in order to meet the requirements of an online system). Both OMD-sha-512 and OMD-blake2b instantiations are taken into consideration.

All formulas used to generate the plots are based on the metrics described in Table 1. We recall that, in the following, Setup represents the number of clock cycles necessary in the initialization phase, Message refers to the number of clock cycles necessary to process a message composed of n blocks, Tag represents the number of clock cycles necessary to compute the tag, and Frequency refers to the frequency of the FPGA circuit.

In Fig. 4 we use latency as the parameter represented by the y axis. We computed latency by the following formula:

$$\text {Latency} = \text {CLK} \cdot (\text {Setup} + n \cdot \text {Message} + \text {Tag}) \cdot 1/\text {Frequency}.$$

We computed the throughput by applying the following formula:

$$ \text {Throughput} = \frac{n \cdot 64 \cdot 8 \cdot \text {Frequency}}{\text {CLK} \cdot (\text {Setup} + n \cdot \text {Message} + \text {Tag}) }. $$

The computation of Tp/A is straightforward.
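The three formulas can be evaluated with a few lines of Python. This is a sketch under stated assumptions: CLK is not defined in this excerpt and is treated as a plain multiplicative factor, Tp/A is taken to be throughput divided by area, and all function and parameter names are hypothetical. The numbers in the usage lines are made up for illustration and are not measurements from the paper.

```python
def total_cycles(n, setup, message, tag, clk=1.0):
    """CLK * (Setup + n * Message + Tag), the term shared by both formulas."""
    return clk * (setup + n * message + tag)

def latency_seconds(n, setup, message, tag, frequency_hz, clk=1.0):
    """Latency = CLK * (Setup + n * Message + Tag) * 1 / Frequency."""
    return total_cycles(n, setup, message, tag, clk) / frequency_hz

def throughput_bits_per_second(n, setup, message, tag, frequency_hz, clk=1.0):
    """Throughput = n * 64 * 8 * Frequency / (CLK * (Setup + n * Message + Tag))."""
    return n * 64 * 8 * frequency_hz / total_cycles(n, setup, message, tag, clk)

def throughput_per_area(throughput, area):
    """Tp/A, assuming it is simply throughput divided by area."""
    return throughput / area

# Illustrative values only.
tp = throughput_bits_per_second(n=4, setup=10, message=5, tag=10, frequency_hz=100e6)
print(latency_seconds(n=4, setup=10, message=5, tag=10, frequency_hz=100e6))
print(tp, throughput_per_area(tp, area=1000))
```

Because Setup and Tag are paid once per message, the throughput approaches $64 \cdot 8 \cdot \text {Frequency} / (\text {CLK} \cdot \text {Message})$ as n grows.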

Cite this paper

Maimuţ, D., Mega, Ş.A. (2020). Speeding up OMD Instantiations in Hardware. In: Simion, E., Géraud-Stewart, R. (eds) Innovative Security Solutions for Information Technology and Communications. SecITC 2019. Lecture Notes in Computer Science, vol 12001. Springer, Cham. https://doi.org/10.1007/978-3-030-41025-4_13

DOI: https://doi.org/10.1007/978-3-030-41025-4_13
Published: 28 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41024-7
Online ISBN: 978-3-030-41025-4
eBook Packages: Computer Science, Computer Science (R0)
