Speeding up OMD Instantiations in Hardware

  • Conference paper
Innovative Security Solutions for Information Technology and Communications (SecITC 2019)

Abstract

Particular instantiations of the Offset Merkle-Damgård authenticated encryption scheme (OMD) represent highly secure alternatives to AES-GCM. OMD is already known to be efficiently implementable in software. We therefore focus on speeding up OMD in hardware, more precisely on FPGA platforms, and propose a new OMD instantiation based on the compression function of BLAKE2b. Moreover, to the best of our knowledge, we present the first FPGA implementation results for the SHA-512 instantiation of OMD, as well as the first architecture of an online authenticated encryption system based on OMD.

Notes

  1. All the withdrawn schemes are listed on the competition’s website. Almost all the withdrawn submissions are due to attacks reported by the community. It can easily be observed that for OMD no attack was presented.

  2. Throughput to Area ratio.

  3. Number used only once.

  4. From a security perspective.

  5. xcvu9p-flga2104-2L-e.

  6. Moreover, the attack does not apply to the OMD CAESAR submission and to the misuse-resistant variants of [14].

References

  1. CAESAR. https://competitions.cr.yp.to/caesar.html

  2. OMDv2 CAESAR Submission. https://competitions.cr.yp.to/round2/omdv20c.pdf

  3. Password Hashing Competition. https://password-hashing.net

  4. Source Code. https://github.com/megastefan22/OMD

  5. Ashur, T., Mennink, B.: Trivial Nonce-misusing attack on pure OMD. IACR Cryptology ePrint Archive (2015)

  6. Aumasson, J.-P., Meier, W., Phan, R.C.-W., Henzen, L.: The Hash Function BLAKE. ISC. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44757-4

  7. Cogliani, S., et al.: OMD: a compression function mode of operation for authenticated encryption. In: Joux, A., Youssef, A. (eds.) SAC 2014. LNCS, vol. 8781, pp. 112–128. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-13051-4_7

  8. Diehl, W., Gaj, K.: RTL implementations and FPGA benchmarking of selected CAESAR round two authenticated ciphers. Microprocess. Microsyst. (2017). https://www.sciencedirect.com/science/article/abs/pii/S0141933117300352

  9. Homsirikamol, E., Rogawski, M., Gaj, K.: Comparing hardware performance of fourteen round two SHA-3 candidates using FPGAs. IACR Cryptology ePrint Archive (2010). http://eprint.iacr.org/2010/445

  10. Krovetz, T., Rogaway, P.: The software performance of authenticated-encryption modes. In: Joux, A. (ed.) FSE 2011. LNCS, vol. 6733, pp. 306–327. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21702-9_18

  11. Maimuţ, D.: Authentication and encryption protocols: design, attacks and algorithmic tools. Ph.D. thesis, École normale supérieure (2015)

  12. Maimuţ, D., Reyhanitabar, R.: Authenticated encryption: toward next-generation algorithms. IEEE Secur. Privacy 12(2), 70–72 (2014)

  13. National Institute of Standards and Technology: FIPS PUB 180–4: Secure Hash Standard. NIST, August 2015

  14. Reyhanitabar, R., Vaudenay, S., Vizár, D.: Misuse-resistant variants of the OMD authenticated encryption mode. In: Chow, S.S.M., Liu, J.K., Hui, L.C.K., Yiu, S.M. (eds.) ProvSec 2014. LNCS, vol. 8782, pp. 55–70. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-12475-9_5

  15. Reyhanitabar, R., Vaudenay, S., Vizár, D.: Boosting OMD for almost free authentication of associated data. In: Leander, G. (ed.) FSE 2015. LNCS, vol. 9054, pp. 411–427. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48116-5_20

  16. Rogaway, P.: Authenticated-encryption with associated-data. In: CCS 2002, pp. 98–107. ACM (2002)

  17. Rogaway, P.: Nonce-based symmetric encryption. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 348–358. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25937-4_22

Acknowledgments

The authors would like to thank Traian Neacşa and George Teşeleanu for their helpful comments.

Author information

Corresponding author

Correspondence to Diana Maimuţ.

Appendices

A OMD Pseudocode

[Figures l, m, n and o: OMD pseudocode listings]

B The sha-512 and blake2b Compression Functions

B.1 Preliminaries

In the following, by “word” we mean a group of \(w = 64\) bits. Namely, in sha-512 each word is a 64-bit string.

\(ROTR^n(x)\) and \(SHR^n(x)\): Let x be a w-bit word and n an integer with \(0 \le n < w\). The rotate right (circular right shift) operation is defined by \(ROTR^n(x) = (x \gg n)\vee (x \ll (w-n))\), where bits shifted beyond the w-bit word are discarded. The right shift operation is defined by \(SHR^n(x) = (x \gg n)\).

Choice and Majority Functions. The choice function and the majority function (also called the median operator) can be defined as follows:

$$\begin{aligned} Ch: \left| \begin{array}{rcl} \{0,1\}^{m} \times \{0,1\}^{m} \times \{0,1\}^{m}&{} \longrightarrow &{} \{0,1\}^{m} \\ x, y, z &{} \longmapsto &{} (x \wedge y) \oplus (\lnot x \wedge z) \\ \end{array} \right. \end{aligned}$$
$$\begin{aligned} Maj: \left| \begin{array}{rcl} \{0,1\}^{m} \times \{0,1\}^{m} \times \{0,1\}^{m}&{} \longrightarrow &{} \{0,1\}^{m} \\ x, y, z &{} \longmapsto &{} (x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z) \\ \end{array} \right. \end{aligned}$$
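As a minimal sketch, the operations above can be written in Python on 64-bit words. The function names and the explicit masking with `MASK` are illustrative choices of this sketch (Python integers are unbounded, so the word size has to be enforced by hand); they are not taken from the paper.

```python
W = 64                 # word size w in bits
MASK = (1 << W) - 1    # keeps every result inside a w-bit word


def rotr(x: int, n: int) -> int:
    """ROTR^n(x): circular right shift of a w-bit word, 0 <= n < w."""
    return ((x >> n) | (x << (W - n))) & MASK


def shr(x: int, n: int) -> int:
    """SHR^n(x): plain right shift; the vacated high bits become zero."""
    return x >> n


def ch(x: int, y: int, z: int) -> int:
    """Ch: each bit of x selects the matching bit of y (if 1) or z (if 0)."""
    return (x & y) ^ (~x & MASK & z)


def maj(x: int, y: int, z: int) -> int:
    """Maj: each output bit is the majority value of the three input bits."""
    return (x & y) ^ (x & z) ^ (y & z)
```

For example, `rotr(1, 1)` moves the lowest bit to the top position, whereas `shr(1, 1)` simply drops it.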

B.2 The sha-512 Compression Function

Sigma Functions. The functions \(\varSigma ^{\{512\}}_0\) and \(\varSigma ^{\{512\}}_1\) are defined as follows:

$$\begin{aligned} \varSigma ^{\{512\}}_0 \left| \begin{array}{rcl} \{0,1\}^{64} &{} \longrightarrow &{} \{0,1\}^{64} \\ x &{} \longmapsto &{}ROTR^{28}(x) \oplus ROTR^{34}(x) \oplus ROTR^{39}(x) \\ \end{array} \right. \end{aligned}$$
$$\begin{aligned} \varSigma ^{\{512\}}_1 \left| \begin{array}{rcl} \{0,1\}^{64} &{} \longrightarrow &{} \{0,1\}^{64} \\ x &{} \longmapsto &{} ROTR^{14}(x) \oplus ROTR^{18}(x) \oplus ROTR^{41}(x) \\ \end{array} \right. \end{aligned}$$

The \(\sigma ^{\{512\}}_0\) and \(\sigma ^{\{512\}}_1\) functions are defined as follows:

$$\begin{aligned} \sigma ^{\{512\}}_0 \left| \begin{array}{rcl} \{0,1\}^{64} &{} \longrightarrow &{} \{0,1\}^{64} \\ x &{} \longmapsto &{}ROTR^{1}(x) \oplus ROTR^{8}(x) \oplus SHR^{7}(x) \\ \end{array} \right. \end{aligned}$$
$$\begin{aligned} \sigma ^{\{512\}}_1 \left| \begin{array}{rcl} \{0,1\}^{64} &{} \longrightarrow &{} \{0,1\}^{64} \\ x &{} \longmapsto &{} ROTR^{19}(x) \oplus ROTR^{61}(x) \oplus SHR^{6}(x) \\ \end{array} \right. \end{aligned}$$

The Process. The sha-512 compression function is defined as:

$$\begin{aligned} \text {sha-512}: \left| \begin{array}{rcl} \{0,1\}^{512} \times \{0,1\}^{1024} &{} \longrightarrow &{} \{0,1\}^{512} \\ H, M &{} \longmapsto &{} D \\ \end{array} \right. \end{aligned}$$

Let H be the 512-bit hash input (chaining input) and M be the 1024-bit message input. These two inputs are represented respectively by an array of 8 64-bit words \(H_0 \Vert \cdots \Vert H_7\) (see Table 2) and an array of 16 64-bit words \(M_0 \Vert \cdots \Vert M_{15}\). The 512-bit output value D is also represented as an array of 8 64-bit words \(D_0 \Vert \cdots \Vert D_7\).

Table 2. sha-512 initial values

During the process of compression, a sequence of 80 constant 64-bit words \(K^{\{512\}}_0, \ldots , K^{\{512\}}_{79}\) is used. These 64-bit words represent the first 64 bits of the fractional parts of the cube roots of the first 80 prime numbers. In hex, these constant words are given in Table 3 (from left to right).

Table 3. sha-512 constants

We further provide the reader with the description of the sha-512 compression function. The addition (\(+\)) is performed modulo \(2^{64}\).

  1. Prepare the message schedule \(\{W_t\}\):

    $$ W_t = \left\{ \begin{array}{lr} M_t, &{} 0 \le t \le 15 \\ \sigma ^{\{512\}}_1(W_{t-2}) + W_{t-7} + \sigma ^{\{512\}}_0(W_{t-15}) + W_{t-16}, &{} 16 \le t \le 79 \end{array} \right. $$

  2. Initialize the eight working variables a, b, c, d, e, f, g and h with the hash input value H:

    \(a = H_0 \qquad b = H_1 \qquad c = H_2 \qquad d = H_3\)

    \(e = H_4 \qquad f = H_5 \qquad g = H_6 \qquad h = H_7\)

  3. For \(t=0\) to 79, do:

    {

    \(T_1 = h + \varSigma ^{\{512\}}_1(e) + Ch(e,f,g) + K^{\{512\}}_t + W_t\)

    \(T_2 = \varSigma ^{\{512\}}_0(a) + Maj(a,b,c)\)

    \(h = g \qquad g = f \qquad f = e \qquad e = d + T_1\)

    \(d = c \qquad c = b \qquad b = a \qquad a = T_1 + T_2\)

    }

  4. Compute the 512-bit output (hash) value \(D = D_0 \Vert \cdots \Vert D_7\) as:

    \(D_0 = a + H_0 \qquad D_1 = b + H_1 \qquad D_2 = c + H_2 \qquad D_3 = d + H_3\)

    \(D_4 = e + H_4 \qquad D_5 = f + H_5 \qquad D_6 = g + H_6 \qquad D_7 = h + H_7\)
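The steps above can be sketched in Python as follows. This is an illustrative, unoptimized rendering rather than the hardware design discussed in the paper. It follows the standard FIPS 180-4 definitions of the sigma functions, regenerates the constants from the cube and square roots of the first primes instead of copying Tables 2 and 3, and shows only the plain, unkeyed compression function, not its use inside OMD. All helper names (`first_primes`, `icbrt`, `compress`, `one_block_digest`, `matches_hashlib`) are assumptions of this sketch.

```python
import hashlib
from math import isqrt

MASK = (1 << 64) - 1  # addition is performed modulo 2^64


def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def icbrt(n):
    """Integer cube root: the largest r with r**3 <= n (binary search)."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# First 64 fractional bits of the cube roots (K) and square roots (IV).
K = [icbrt(p << 192) & MASK for p in first_primes(80)]
IV = [isqrt(p << 128) & MASK for p in first_primes(8)]


def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK


def big_sigma0(x):
    return rotr(x, 28) ^ rotr(x, 34) ^ rotr(x, 39)


def big_sigma1(x):
    return rotr(x, 14) ^ rotr(x, 18) ^ rotr(x, 41)


def small_sigma0(x):
    return rotr(x, 1) ^ rotr(x, 8) ^ (x >> 7)


def small_sigma1(x):
    return rotr(x, 19) ^ rotr(x, 61) ^ (x >> 6)


def compress(H, M):
    """Map 8 chaining words H and 16 message words M to 8 output words."""
    W = list(M)                                   # step 1: message schedule
    for t in range(16, 80):
        W.append((small_sigma1(W[t - 2]) + W[t - 7]
                  + small_sigma0(W[t - 15]) + W[t - 16]) & MASK)
    a, b, c, d, e, f, g, h = H                    # step 2: working variables
    for t in range(80):                           # step 3: 80 rounds
        choice = (e & f) ^ (~e & MASK & g)
        majority = (a & b) ^ (a & c) ^ (b & c)
        t1 = (h + big_sigma1(e) + choice + K[t] + W[t]) & MASK
        t2 = (big_sigma0(a) + majority) & MASK
        h, g, f, e = g, f, e, (d + t1) & MASK
        d, c, b, a = c, b, a, (t1 + t2) & MASK
    # step 4: feed-forward addition of the chaining input
    return [(x + y) & MASK for x, y in zip((a, b, c, d, e, f, g, h), H)]


def one_block_digest(msg):
    """SHA-512 digest of a message short enough to fit one padded block."""
    if len(msg) >= 112:
        raise ValueError("message must be shorter than 112 bytes")
    block = (msg + b"\x80" + b"\x00" * (111 - len(msg))
             + (8 * len(msg)).to_bytes(16, "big"))
    M = [int.from_bytes(block[8 * i:8 * i + 8], "big") for i in range(16)]
    return b"".join(word.to_bytes(8, "big") for word in compress(IV, M))


def matches_hashlib(msg):
    """Cross-check the sketch against the standard library's SHA-512."""
    return one_block_digest(msg) == hashlib.sha512(msg).digest()
```

Calling `matches_hashlib(b"abc")` compares one call of `compress` on the standard initial value with the SHA-512 digest computed by `hashlib`, which gives a quick sanity check of the round structure and of the regenerated constants.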

B.3 The blake2b Compression Function

The initial values H of blake2b were chosen to be precisely those of SHA-512 (given in Table 2). These values “were obtained by taking the first sixty-four bits of the fractional parts of the square roots of the first eight prime numbers”, according to [13].

Thus, the compression function blake2b takes as input:

[Figure p: inputs of the blake2b compression function]
$$\begin{aligned} \begin{pmatrix} \nu _0 &{} \nu _1 &{} \nu _2 &{} \nu _3 \\ \nu _4 &{} \nu _5 &{} \nu _6 &{} \nu _7 \\ \nu _8 &{} \nu _9 &{} \nu _{10} &{} \nu _{11} \\ \nu _{12} &{} \nu _{13} &{} \nu _{14} &{} \nu _{15} \end{pmatrix}&:= \begin{pmatrix} h_0 &{} h_1 &{} h_2 &{} h_3 \\ h_4 &{} h_5 &{} h_6 &{} h_7 \\ H_0 &{} H_1 &{} H_2 &{} H_3 \\ T_0 \oplus H_4 &{} T_1 \oplus H_5 &{} F_0 \oplus H_6 &{} F_1 \oplus H_7 \end{pmatrix} \end{aligned}$$
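The state initialization above can be illustrated by the following Python sketch. It reads \(h_0, \ldots , h_7\) as the chaining value and \(H_0, \ldots , H_7\) as the initial values of Table 2, and it assumes that \(T_0, T_1\) and \(F_0, F_1\) are the counter and finalization-flag words of the BLAKE2 specification, which this appendix does not define explicitly. The name `init_state` is hypothetical.

```python
from math import isqrt

MASK = (1 << 64) - 1

# Initial values H_0..H_7: first 64 fractional bits of the square roots
# of the first eight primes (the same words as the sha-512 initial values).
IV = [isqrt(p << 128) & MASK for p in (2, 3, 5, 7, 11, 13, 17, 19)]


def init_state(h, t0, t1, f0, f1):
    """Build the 16-word state v_0..v_15, listed row by row.

    h      -- chaining value, 8 words (rows 1 and 2 of the matrix)
    t0, t1 -- counter words (assumed meaning of T_0, T_1)
    f0, f1 -- finalization flags (assumed meaning of F_0, F_1)
    """
    if len(h) != 8:
        raise ValueError("the chaining value must have 8 words")
    return (list(h)                        # v_0..v_7   = h_0..h_7
            + IV[:4]                       # v_8..v_11  = H_0..H_3
            + [t0 ^ IV[4], t1 ^ IV[5],     # v_12, v_13
               f0 ^ IV[6], f1 ^ IV[7]])    # v_14, v_15
```

With all of `t0`, `t1`, `f0` and `f1` set to zero, the last two rows of the state are simply the eight initial values.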
Table 4. Permutations of blake2b

Let the round permutations \(\sigma _r\) be in accordance with Table 4, where \(r=\overline{0,9}\). Note that for rounds \(r \ge 10\) the permutation used is \(\sigma _{r \bmod 10}\). The core function G of blake2b is defined as follows:

\(a := a + b + m_{\sigma _r(2i)} \quad d := ROTR^{32}(d \oplus a) \quad c := c + d \quad b := ROTR^{24}(b \oplus c)\)

\(a := a + b + m_{\sigma _r(2i+1)} \quad d := ROTR^{16}(d \oplus a) \quad c := c + d \quad b := ROTR^{63}(b \oplus c)\)
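A minimal Python sketch of one application of G is given below. To keep it independent of Table 4, the two message words \(m_{\sigma _r(2i)}\) and \(m_{\sigma _r(2i+1)}\) are passed in directly as `x` and `y`; this parameterization and the function names are choices of the sketch, and additions are taken modulo \(2^{64}\).

```python
MASK = (1 << 64) - 1


def rotr(x, n):
    """Rotate a 64-bit word right by n positions."""
    return ((x >> n) | (x << (64 - n))) & MASK


def g(a, b, c, d, x, y):
    """Apply G to four state words, mixing in message words x and y.

    x stands for m[sigma_r(2i)] and y for m[sigma_r(2i+1)].
    """
    # first half: rotations by 32 and 24
    a = (a + b + x) & MASK
    d = rotr(d ^ a, 32)
    c = (c + d) & MASK
    b = rotr(b ^ c, 24)
    # second half: rotations by 16 and 63
    a = (a + b + y) & MASK
    d = rotr(d ^ a, 16)
    c = (c + d) & MASK
    b = rotr(b ^ c, 63)
    return a, b, c, d
```

Since G only uses additions, XORs and fixed rotations, an all-zero input stays all-zero, while a single set bit in `x` already spreads to several bit positions of every output word after one call.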

Fig. 4. Latency vs. block size

C Explicit Performance Metrics

The main performance metrics used in our work are throughput, area and throughput to area ratio (Tp/A). They are presented in Figs. 4, 5 and 6 as functions of the block size of a message (represented on the x axis in all plots). The block size n ranges from 64 bytes to 9600 bytes (we chose these values in order to meet the requirements of an online system). Both the OMD-sha-512 and OMD-blake2b instantiations are taken into consideration.

All formulas used to generate the plots are based on the metrics described in Table 1. We recall that, in the following, Setup represents the number of clock cycles necessary in the initialization phase, Message refers to the number of clock cycles necessary to process a message composed of n blocks, Tag represents the number of clock cycles necessary to compute the tag and Frequency refers to the frequency of the FPGA circuit.

Fig. 5. Throughput vs. block size

Fig. 6. Throughput to area ratio (Tp/A) vs. block size

In Fig. 4 the y axis represents latency, which we computed by the following formula:

$$\text {Latency} = \text {CLK} \cdot (\text {Setup} + n \cdot \text {Message} + \text {Tag}) \cdot 1/\text {Frequency}.$$

We computed the throughput by applying the following formula:

$$ \text {Throughput} = \frac{n \cdot 64 \cdot 8 \cdot \text {Frequency}}{\text {CLK} \cdot (\text {Setup} + n \cdot \text {Message} + \text {Tag}) }. $$

The computation of Tp/A is straightforward.
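The formulas above translate directly into the following Python sketch. Parameter names mirror the symbols of the formulas; CLK is taken as it appears there, since its exact meaning is given by Table 1 and not restated in this appendix. The units are assumptions of the sketch (Frequency in Hz, hence latency in seconds and throughput in bits per second), and the factor 64 is exposed as a hypothetical `block_bytes` parameter.

```python
def latency(n, clk, setup, message, tag, frequency):
    """Latency = CLK * (Setup + n * Message + Tag) / Frequency."""
    return clk * (setup + n * message + tag) / frequency


def throughput(n, clk, setup, message, tag, frequency, block_bytes=64):
    """Throughput = n * 64 * 8 * Frequency / (CLK * (Setup + n * Message + Tag))."""
    return n * block_bytes * 8 * frequency / (clk * (setup + n * message + tag))


def tp_per_area(tp, area):
    """Throughput to area ratio (Tp/A)."""
    return tp / area
```

The two main formulas are consistent with each other: for the same parameters, the product of throughput and latency equals the number of processed bits, n · 64 · 8.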

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Maimuţ, D., Mega, Ş.A. (2020). Speeding up OMD Instantiations in Hardware. In: Simion, E., Géraud-Stewart, R. (eds) Innovative Security Solutions for Information Technology and Communications. SecITC 2019. Lecture Notes in Computer Science, vol 12001. Springer, Cham. https://doi.org/10.1007/978-3-030-41025-4_13

  • DOI: https://doi.org/10.1007/978-3-030-41025-4_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41024-7

  • Online ISBN: 978-3-030-41025-4

  • eBook Packages: Computer Science, Computer Science (R0)
