1 Introduction

An Oblivious RAM (ORAM), first introduced by Goldreich and Ostrovsky [Gol87, Ost90, GO96], is a scheme that allows a client to read and write to his data stored on untrusted storage, while entirely hiding the access pattern, i.e., which operations were performed and at which locations. More precisely, we think of the client’s data as “logical memory” which the ORAM scheme encodes and stores in “physical memory”. Whenever the client wants to read or write to logical memory, the ORAM scheme translates this operation into several accesses to the physical memory. Security ensures that for any two (equal length) sequences of access to logical memory, the resultant distributions over the physical accesses performed by the ORAM are computationally (or statistically) close. Following its introduction, there has been a large body of work on ORAM constructions and security [SCSL11, GMOT12, KLO12, WS12, SvDS+13, RFK+15, DvDF+16], as well as its uses in various application scenarios (see, e.g., [GKK+12, GGH+13, LPM+13, LO13, MLS+13, SS13, YFR+13, CKW13, WHC+14, MBC14, KS14, LHS+14, GHJR15, BCP15, HOWW18]).

One can always trivially hide the memory access pattern by performing a linear scan of the entire memory for every memory access. Consequently, an important measure of an ORAM scheme is its overhead, namely the number of memory blocks which need to be accessed to answer a single \(\textsf {read}\) or \(\textsf {write}\) request. Goldreich and Ostrovsky [GO96] proved a lower bound of \(\varOmega \left( \log n\right) \) on the ORAM overhead, where \(n\) denotes the number of memory blocks in the logical memory. There are also ORAM constructions achieving this bound [SvDS+13, WCS15], at least if the block size is set to a sufficiently large polylogarithmic term; and works [PPRY] achieving \(O\left( \log n\log \log n\right) \) overhead for \(\varOmega \left( \log n\right) \) block size, assuming one-way functions. We note that one can circumvent the [GO96] lower bound by relaxing the notion of ORAM to either allow server-side computation [AKST14], or multiple non-colluding servers [LO13], and several works have obtained sub-logarithmic overhead in these settings [AKST14, FNR+15, DvDF+16, ZMZQ16, AFN+17, WGK18, KM18]. However, in this work we focus on the standard ORAM setting with a single server and no server-side computation.

In some respects, the lower bound of [GO96] is very general. First, it applies to all block sizes. Second, it holds also in restricted settings: when the ORAM is only required to work for offline programs in which, roughly, all memory accesses are stated explicitly in advance; and for read-only programs that do not update the memory contents. However, in other respects, the bound is restricted since it only applies to ORAM schemes that operate in the “balls and bins” model, in which memory can only be manipulated by moving memory blocks (“balls”) from one memory location (“bin”) to another. Therefore, the main question left open by the work of [GO96] is: is there an ORAM lower bound for general ORAM schemes, that are not restricted to operate in the “balls and bins” model?

Almost 20 years after Goldreich and Ostrovsky proved their lower bound, it was revisited by Boyle and Naor [BN16], who show how to construct an ORAM scheme in the offline setting with \(o\left( \log n\right) \) overhead, using sorting circuits of size \(o\left( n\log n\right) \). Though sorting circuits of such size are not known, ruling out their existence seems currently out of our reach. This result can be interpreted in two ways. On the one hand, an optimist will view it as a possible approach towards an ORAM construction in the offline setting, which uses “small” sorting circuits as a building block. On the other hand, a pessimist may view this result as a barrier towards proving a lower bound. Indeed, the [BN16] construction shows that proving a lower bound on the overhead of offline ORAM schemes would yield lower bounds on the size of sorting circuits, and proving circuit lower bounds is notoriously difficult. We note that unlike sorting networks, which only contain “compare-and-swap” gates that operate on the two input words as a whole, and for which a simple \(\varOmega \left( n\log n\right) \) lower bound exists, sorting circuits can arbitrarily operate over the input bits, and no such lower bounds are known for them.

The main drawback of the Boyle and Naor result [BN16] is that it only applies to the offline setting, which is not very natural and is insufficient for essentially any imaginable ORAM application. More specifically, the offline setting requires that the entire sequence of accesses be specified in advance: which operation is performed, on which address, and in case of a \(\textsf {write}\) operation, what value is written. However, even very simple and natural RAM programs (e.g., binary search) require dynamic memory accesses that depend on the results of previous operations. Despite this drawback, the result of Boyle and Naor is still very interesting since it shows that lower bounds which are easy to prove in the “balls and bins” model might not extend to the general model. However, it does not answer the question of whether general ORAM lower bounds exist in the online setting, which is the one of interest for virtually all ORAM applications.

Very recently, and concurrently with our work, Larsen and Nielsen [LN18] proved that the [GO96] lower bound does indeed extend to general online ORAM. Concretely, they show an \(\varOmega \left( \log n\right) \) lower bound on the combined overhead of \(\textsf {read}\) and \(\textsf {write}\) operations in any general online ORAM, even with computational security. Their elegant proof employs techniques from the field of data-structure lower bounds in the cell-probe model, and in particular the “information-transfer” method of Pătraşcu and Demaine [PD06].

1.1 Our Contributions

In this work, we explore the \(\textsf {read}\) overhead of general ORAM schemes beyond the “balls and bins” model and in the online setting. We first consider read-only ORAM schemes that only support reads – but not writes – to the logical memory. We stress that the scheme is read-only in the sense that it only supports programs that do not write to the logical memory. However, to emulate such programs in the ORAM, the client might write to the physical memory stored on the server. We note that read-only ORAM already captures many interesting applications such as private search over a database, or fundamental algorithmic tasks such as binary search. We show how to construct online read-only ORAM schemes with \(o(\log n)\) overhead assuming “small” sorting circuits and “good” Locally Decodable Codes (LDCs). We then extend our results to a setting which also supports sub-linear writes but does not try to hide whether an operation is a \(\textsf {read}\) or a \(\textsf {write}\) and, in particular, allows different overheads for these operations. In all our constructions, the server is only used as remote storage, and does not perform any computations.

We note that, similar to [BN16], our results rely on primitives that we do not know how to instantiate with the required parameters, but also do not have any good lower bounds for. One can therefore interpret our results either positively, as a blueprint for an ORAM construction, or negatively as a barrier to proving a lower bound in these settings. For simplicity of the exposition, we choose to present our results through the “optimistic” lens.

We now describe our results in more detail.

Read-Only (RO) ORAM. We construct a read-only ORAM scheme, based on sorting circuits and smooth locally decodable codes. Roughly, a Locally Decodable Code (LDC) [KT00] has a decoder algorithm that can recover any message symbol by querying only a few codeword symbols. In a smooth code, every individual decoder query is uniformly distributed. Given a logical memory of size \(n\), our scheme has \(O\left( \log \log n\right) \) overhead, assuming the existence of linear-size sorting circuits, and smooth LDCs with constant query complexity and polynomial length codewords. Concretely, we get the following theorem.

Theorem 1

(Informal statement of Corollary 1). Suppose there exist linear-size boolean sorting circuits, and smooth LDCs with constant query complexity and polynomial length codewords. Then there exists a statistically-secure read-only ORAM scheme for memory of size \({n}\) and blocks of size \({{\mathrm{poly}}}\log n\), with \(O\left( 1\right) \) client storage and \(O\left( \log \log {n}\right) \) overhead.

In Sect. 3, we also show a read-only ORAM scheme with \(o\left( \log n\right) \) overhead based on milder assumptions – concretely, smooth LDCs with \(O\left( \log \log n\right) \) query complexity, and the existence of sorting circuits of size \(o\left( \frac{n\log n}{\log ^2\log n}\right) \); see Corollary 2. We note that under the (strong) assumption that the LDC has linear-size codewords, our constructions achieve linear-size server storage. We also note that if an a-priori polynomial bound on the number of memory accesses is known, then the constructions can be based solely on LDCs, and the assumption regarding small sorting circuits can be removed.

ORAM Schemes Supporting Writes. The read-only ORAM scheme described above still leaves the following open question: is there a lower bound on \(\textsf {read}\) overhead for ORAM schemes supporting \(\textsf {write}\) operations? To partially address this question, we extend our ORAM construction to a scheme that supports writes but does not hide whether an operation was a \(\textsf {read}\) or a \(\textsf {write}\). In this setting, \(\textsf {read}\) and \(\textsf {write}\) operations may have different overheads, and we focus on minimizing the overhead of \(\textsf {read}\) operations while preserving efficiency of \(\textsf {write}\) operations as much as possible. Our construction is based on the existence of sorting circuits and smooth LDCs as in Theorem 1, as well as the existence of One-Way Functions (OWFs). (We elaborate on why OWFs are needed in Sect. 1.2.) Assuming the existence of such building blocks, our scheme has \(O\left( \log \log n\right) \) \(\textsf {read}\) overhead and \(O\left( n^{\epsilon }\right) \) \(\textsf {write}\) overhead for an arbitrarily small constant \(\epsilon \in \left( 0,1\right) \), whose exact value depends on the efficiency of the LDC encoding. Concretely, we show the following:

Theorem 2

(Informal statement of Theorem 7). Assume the existence of OWFs, as well as LDCs and sorting circuits as in Theorem 1. Then for every constant \(\epsilon \in \left( 0,1\right) \), there exists a constant \(\gamma \in \left( 0,1\right) \) such that if LDC encoding requires \(n^{1+\gamma }\) operations then there is a computationally-secure ORAM scheme for memory of size \({n}\) and blocks of size \({{\mathrm{poly}}}\log n\) with \(O\left( 1\right) \) client storage, \(O\left( \log \log {n}\right) \) \(\textsf {read}\) overhead, and \(O\left( {n}^{\epsilon }\right) \) \(\textsf {write}\) overhead.

Similar to the read-only setting, we also instantiate (Sect. 4, Theorem 8) the ORAM with writes scheme based on milder assumptions regarding the parameters of the underlying sorting circuits and LDCs, while only slightly increasing the \(\textsf {read}\) overhead. Additionally, we describe a variant of our scheme with improved \(\textsf {write}\) complexity, again at the cost of slightly increasing the \(\textsf {read}\) overhead:

Theorem 3

(Informal statement of Theorem 9). Assume the existence of OWFs, as well as LDCs and sorting circuits as in Theorem 1, where LDC encoding requires \(n^{1+o\left( 1\right) }\) operations. Then there exists a computationally-secure ORAM scheme for memory of size \({n}\) and blocks of size \({{\mathrm{poly}}}\log n\) with \(O\left( 1\right) \) client storage, \(o\left( \log {n}\right) \) \(\textsf {read}\) overhead, and \({n}^{o\left( 1\right) }\) \(\textsf {write}\) overhead.

A Note on Block vs. Word Size. In our constructions we distinguish between words (which are bit strings) and blocks (which consist of several words). More specifically, words, which are the basic unit of physical memory on the server, consist of w bits; and blocks, which are the basic unit of logical memory on the client, consist of \(\mathsf {B}\) words. We measure the overhead as the number of words the client accesses on the server to read or write to a single logical block, divided by \(\mathsf {B}\). We note that it is generally easier to construct schemes with smaller word size. (Indeed, it allows the client more fine-grained access to the physical memory; a larger word size might cause the client to access unneeded bits on the server, simply because they are part of a word containing bits that do interest the client.) Consequently, we would generally like to support larger word size, ideally having words and blocks of equal size. Our constructions can handle any word size, as long as blocks are poly-logarithmically larger (for a sufficiently large polylogarithmic factor). A similar differentiation between block and word size was used in some previous works as well (e.g., to get \(O\left( \log N\right) \) overhead in Path ORAM [SvDS+13]).

A Note Regarding Assumptions. We instantiate our constructions in two parameter regimes: one based on the existence of “best possible” sorting circuits and smooth LDCs (as described above), and one based on milder assumptions regarding the parameters of these building blocks (as discussed in Sects. 3 and 4). We note that despite years of research in these fields, we currently seem very far from ruling out the existence of even the “best possible” sorting circuits and smooth LDCs. Concretely, to the best of our knowledge there are no specific lower bounds for sorting circuits (as opposed to sorting networks, see discussion above and in Sect. 2.2), and even for general boolean circuits only linear lower bounds of \(c\cdot n\) for some constant \(c>1\) are known [Blu84, IM02, FGHK16]. Regarding LDCs, research has focused on the relation between the query complexity and codeword length in the constant query regime, but there are currently no non-trivial lower bounds for general codes. Even for restricted cases, such as binary codes, or linear codes over arbitrary fields, the bounds are extremely weak. Specifically, the best known lower bound shows that codewords in q-query LDCs must have length \(\varOmega \left( n^{(q+1)/(q-1)}/\log n\right) \) [Woo07] (which, in particular, does not rule out the existence of 4-query LDCs with codeword length \(n^{5/3}\)), so it is plausible that for a sufficiently large constant, constant-query LDCs with polynomial length codewords exist. We note that a recent series of breakthrough results construct 3-query LDCs with sub-exponential codewords of length \( \exp \left( \exp \left( O\left( \sqrt{\log n\log \log n}\right) \right) \right) =2^{n^{o\left( 1\right) }}\), as well as extensions to larger (constant) query complexity [Yek07, Rag07, Efr09, IS10, CFL+13]. Notice that lower bounds on the size of the encoding circuit of such codes will similarly yield circuit lower bounds.

A Note on the Connection to Private Information Retrieval (PIR) and Doubly-Efficient PIR (DEPIR). The notions of PIR and DEPIR, which support reads from memory stored on a remote server, are closely related to read-only ORAM, but differ from it significantly in some respects. We now discuss these primitives in more detail. In a (single-server) PIR scheme [KO97], there is no initial setup, and anybody can run a protocol with the server to retrieve an arbitrary location in the logical memory. The server is not used solely as remote storage, and in fact the main goal, which is to minimize the communication between the client and server, inherently requires the server to perform computations. One additional significant difference from ORAM is that the PIR privacy guarantee inherently requires the server runtime to be linear in the size of the logical memory, whereas a main ORAM goal is to have the server touch only a sublinear number of blocks (which the client reads from it to retrieve the block he is interested in). In a DEPIR scheme [BIM00, BIPW17, CHR17], there is a setup phase (as in ORAM), following which the server(s) stores an encoded version of the logical memory, and the logical memory can be accessed either with no key (in multi-server DEPIR [BIM00]), with a public key (in public-key DEPIR [BIPW17]) or with a secret key (in secret-key DEPIR [BIPW17, CHR17]). DEPIR was first proposed by Beimel, Ishai and Malkin [BIM00], who showed how to construct information-theoretic DEPIR schemes in the multi-server setting (i.e., with several non-colluding servers). Two recent works [BIPW17, CHR17] give the first evidence that this notion may be achievable in the single-server setting. These works achieve sublinear server runtime, with a server that is only used as remote storage.
Thus, these single-server DEPIR schemes satisfy all the required properties of a RO-ORAM scheme, with the added “bonus” of having a stateless server (namely, whose internal memory does not change throughout the execution of the scheme). However, these (secret-key) constructions are based on new, previously unstudied, computational hardness assumptions relating to Reed-Muller codes, and the public-key DEPIR scheme of [BIPW17] additionally requires a heuristic use of obfuscation. Unfortunately, both of the above assumptions are non-standard, poorly understood, and not commonly accepted. Additionally, these constructions do not achieve \(o\left( \log n\right) \) overhead (at least not with polynomial server storage).

A Note on Statistical vs. Computational Security. Our RO-ORAM achieves statistical security under the assumption that the server does not see the memory contents, namely the server only sees which memory locations are accessed. Hiding memory contents from the server can be generically achieved by encrypting the logical memory, in which case security holds against computationally-bounded servers. We note that our ORAM scheme supporting writes requires encrypting the logical memory even if the server does not see the memory contents. Consequently, our ORAM with writes scheme achieves computational security even in the setting where the server does not see the memory contents. Alternatively, our construction can achieve statistical security if the underlying LDC has the additional property that the memory accesses during encoding are independent of the data. (This property is satisfied by, e.g., linear codes.) We elaborate on this further in Sects. 3.1 and 4.

1.2 Our Techniques

We now give a high-level overview of our ORAM constructions. We start with the read-only setting, and then discuss how to enable writes.

We note that our technique departs quite significantly from that of Boyle and Naor [BN16], whose construction seems heavily tied to the offline setting. Indeed, the high-level idea underlying their scheme is to use the sorting circuit to sort by location the list of operations that need to be performed, so that the outcomes of the read operations can then be easily determined by making one linear scan of the list. It does not appear that this strategy can naturally extend to the online setting in which the memory accesses are not known a-priori.

Read-Only ORAM. We first design a Read-Only (RO) ORAM scheme that is secure only for an a-priori bounded number of accesses, then extend it to a scheme that remains secure for any polynomial number of accesses.

Bounded-Access RO-ORAM Using Metadata. Our RO-ORAM scheme employs a smooth LDC, using the decoder to read from memory. Recall that a k-query LDC is an error-correcting code in which every message symbol can be recovered by querying k codeword symbols. The server in our scheme stores k copies of the codeword, each permuted using a separate, random permutation. (We note that permuted LDCs were already used – but in a very different way – in several prior works [HO08, HOSW11, CHR17, BIPW17].) To read the memory block at address j, the client runs the decoder on j, and sends the decoder queries to the server, who uses the i’th permuted codeword copy to answer the i’th decoding query. This achieves correctness, but does not yet guarantee obliviousness since the server learns, for each \(1\le i\le k\), which \(\textsf {read}\) operations induced the same i’th decoding query.
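As a sanity check of this read protocol, the following Python sketch instantiates it with a toy 2-query Hadamard LDC (which is smooth, though with exponential codeword length, far from the parameters our theorems assume) and plain Python lists in place of server storage. The `Client` class and all names are illustrative only, and the fresh-query bookkeeping discussed next is omitted.

```python
import random

# Toy 2-query smooth LDC (Hadamard code) standing in for the assumed LDC:
# the codeword entry at index a is <msg, a> mod 2, so each decoder query
# is individually uniform over [M].
def enc(msg):
    m = sum(b << i for i, b in enumerate(msg))
    return [bin(a & m).count("1") % 2 for a in range(2 ** len(msg))]

def query(n, ell):
    a = random.randrange(2 ** n)
    return [a, a ^ (1 << ell)]          # two queries, each uniform over [M]

def dec(answers):
    return answers[0] ^ answers[1]      # <msg,a> + <msg, a+e_ell> = msg[ell]

class Client:
    """Holds k random permutations; the server stores k permuted codeword copies
    (k = 2 here, matching the toy LDC's query complexity)."""
    def __init__(self, msg, k=2):
        c = enc(msg)
        self.n = len(msg)
        M = len(c)
        self.perms = [random.sample(range(M), M) for _ in range(k)]
        # copy i stores c permuted under pi_i: copy[pi_i(j)] = c[j]
        self.server = []
        for pi in self.perms:
            copy = [0] * M
            for j, bit in enumerate(c):
                copy[pi[j]] = bit
            self.server.append(copy)

    def read(self, ell):
        qs = query(self.n, ell)
        # the server only ever sees the permuted indices pi_i(q_i),
        # and answers the i'th query from the i'th permuted copy
        answers = [self.server[i][self.perms[i][q]] for i, q in enumerate(qs)]
        return dec(answers)
```

Note that correctness already holds here, but repeated reads of the same logical index hit copy i at correlated positions, which is exactly the leakage the fresh-query restriction described next removes.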

To prevent the server from obtaining this additional information, we restrict the client to use only fresh decoding queries in each \(\textsf {read}\) operation, namely a set \(q_1,\ldots ,q_k\) of queries such that no \(q_i\) was issued before as the i’th query. The metadata regarding which decoding queries are fresh, as well as the description of the permutations, can be stored on the server using any sufficiently efficient (specifically, polylogarithmic-overhead) ORAM scheme. Each block in the metadata ORAM will consist of a single word, so using the metadata ORAM will not influence the overall complexity of the scheme, since for sufficiently large memory blocks the metadata blocks are significantly smaller. In summary, restricting the client to make fresh queries guarantees that the server only sees uniformly random decoding queries, which reveal no information regarding the identity of the accessed memory blocks.

However, restricting the client to only make fresh decoding queries raises the question of whether the ORAM is still correct, namely whether this restriction has not harmed functionality. Specifically, can the client always “find” fresh decoding queries? We show this is indeed the case as long as the number of \(\textsf {read}\) operations is at most \(M/2k\), where \(M\) denotes the codeword length. More precisely, the smoothness of the code guarantees that for security parameter \(\lambda \) and any index \(j\in \left[ {n}\right] \), \(\lambda \) independent executions of the decoder algorithm on index j will (with overwhelming probability) produce at least one set of fresh decoding queries. Thus, the construction is secure as long as the client performs at most \(M/2k\) \(\textsf {read}\) operations.
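The underlying calculation is a simple union bound (a back-of-the-envelope version of the actual argument): at most \(M/2k\) positions have so far been used as the \(i\)'th query, and by smoothness each candidate query is uniform over \(\left[ M\right] \), so for a single decoder run

$$\Pr \left[ q_i\text { is not fresh}\right] \le \frac{M/2k}{M}=\frac{1}{2k},\qquad \Pr \left[ \exists i:\ q_i\text { is not fresh}\right] \le k\cdot \frac{1}{2k}=\frac{1}{2},$$

and hence \(\lambda \) independent decoder runs all fail to yield a fresh query set with probability at most \(2^{-\lambda }\).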

We note that given an appropriate LDC, this construction already gives a read-only ORAM scheme which is secure for an a-priori bounded number of accesses, without relying on sorting circuits. Indeed, given a bound \(B\) on the number of accesses, all we need is a smooth LDC with length-\(M\) codewords, in which the decoder’s query complexity is at most \(M/2B\).

Handling an Unlimited Number of Reads. To obtain security for an unbounded number of \(\textsf {read}\) operations, we “refresh” the permuted codeword copies every \(M/2k\) operations. (We call each such set of \(\textsf {read}\) operations an “epoch”.) Specifically, to refresh the codeword copies the client picks k fresh, random permutations, and together with the server uses the sorting circuit to permute the codeword copies according to the new permutations. Since the logical memory is read-only, the refreshing operations can be spread out across the \(M/2k\) \(\textsf {read}\) operations of the epoch.
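To see why refreshing does not dominate the overhead, consider the following rough amortization (optimistically assuming linear-size sorting circuits, so that obliviously re-permuting a length-\(M\) codeword copy costs \(O\left( M\right) \) block operations): refreshing all \(k\) copies costs \(k\cdot O\left( M\right) \), and spreading this work over the \(M/2k\) \(\textsf {read}\) operations of an epoch adds

$$\frac{k\cdot O\left( M\right) }{M/2k}=O\left( k^2\right) $$

block operations per \(\textsf {read}\), i.e., \(O\left( 1\right) \) for constant query complexity \(k\).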

ORAM with Writes. We extend our RO-ORAM scheme to support \(\textsf {write}\) operations, while preserving \(o\left( \log n\right) \) overhead for \(\textsf {read}\) operations. The construction is loosely based on hierarchical ORAM [Ost90, GO96]. The high-level idea is to store the logical memory on the server in a sequence of \(\ell \) levels of increasing size, each containing an RO-ORAM. We think of the levels as growing from the top down, namely level-1 (the smallest) is the top-most level, and level-\(\ell \) (the largest) is the bottom-most. Initially, all the data is stored in the bottom level \(\ell \), and all the remaining levels are empty. To read the memory block at some location j, the client performs a \(\textsf {read}\) for location j in the RO-ORAMs of all levels, where the output is the block from the highest level that contains the j’th block. When the client writes to some location j, that memory block is placed in the top level \(i=1\). After every \(l_i\) \(\textsf {write}\) operations – where \(l_i\) denotes the size of level i – the i’th level becomes full. All the values in level i are then moved to level \(i+1\), a process which we call a “reshuffle” of level i into level \(i+1\). Formalizing this high-level intuition requires some care, and the final scheme is somewhat more involved. See Sect. 4 for details.
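The data movement (though not the obliviousness, which the per-level RO-ORAMs and the reshuffle procedure provide) can be sketched in Python, with plain dictionaries standing in for the per-level RO-ORAMs; all names and the choice of level sizes are illustrative.

```python
class LeveledStore:
    """Read/write/reshuffle flow of the leveled construction.
    Dictionaries stand in for per-level RO-ORAMs, so no access
    pattern is hidden here; only the data movement is modeled."""

    def __init__(self, n, sizes):
        # capacities l_1 < l_2 < ... < l_ell = n, top level first
        self.sizes = sizes
        self.levels = [dict() for _ in sizes]
        self.levels[-1] = {j: 0 for j in range(n)}  # all data starts at the bottom

    def read(self, j):
        for level in self.levels:       # scan from the top level down
            if j in level:
                return level[j]         # highest level holds the freshest copy

    def write(self, j, value):
        self.levels[0][j] = value       # writes always land in the top level
        for i in range(len(self.levels) - 1):
            if len(self.levels[i]) >= self.sizes[i]:
                # "reshuffle": move level i into level i+1; level i entries
                # are fresher, so they override on collision
                self.levels[i + 1] = {**self.levels[i + 1], **self.levels[i]}
                self.levels[i] = {}
```

A single write can trigger a cascade of reshuffles; in the actual scheme each reshuffle also rebuilds the destination level's RO-ORAM, which is the source of the higher \(\textsf {write}\) overhead.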

We note that our construction differs from Hierarchical ORAM in two main points. First, in Hierarchical ORAM level i is reshuffled into level \(i+1\) every \(l_i\) \(\textsf {read}\) or \(\textsf {write}\) operations, whereas in our scheme only \(\textsf {write}\) operations are “counted” towards reshuffle (in that respect, \(\textsf {read}\) operations are “free”). This is because the data is stored in each level using an RO-ORAM which already guarantees privacy for \(\textsf {read}\) operations. Second, Hierarchical ORAM uses \(\varOmega \left( \log n\right) \) levels, whereas to preserve \(o\left( \log n\right) \) \(\textsf {read}\) overhead, we must use \(o\left( \log n\right) \) levels. In particular, the ratio between consecutive levels in our scheme is no longer constant, leading to a higher reshuffle cost (which is the reason \(\textsf {write}\) operations have higher overhead in our scheme).

2 Preliminaries

Throughout the paper \(\lambda \) denotes a security parameter. For a length-n string \(\mathbf {x}\) and a subset \(I=\left\{ i_1,\ldots ,i_l\right\} \subseteq \left[ n\right] \), \(\mathbf {x}_I\) denotes \(\left( x_{i_1},\ldots ,x_{i_l}\right) \).

Terminology. Recall that words, the basic unit of physical memory on the server, consist of w bits; and blocks, the basic unit of logical memory on the client, consist of \(\mathsf {B}\) words. The client may locally perform bit operations on the bit representation of blocks, but can only access full words on the server. We will usually measure complexity in terms of logical blocks (namely, in terms of the basic memory unit on the client). More specifically, unless explicitly stated otherwise, client and server storage are measured as the number of blocks they store (even though the basic storage unit on the server side is a word), and overhead measures the number of blocks one needs to read or write to implement a \(\textsf {read}\) or \(\textsf {write}\) operation on a single block. Formally:

Definition 1

(Overhead). For a block size \(\mathsf {B}\) and input length \({n}\), we say that a protocol between client C and server S has overhead \(\mathsf {Ovh}\) for a function \(\mathsf {Ovh}:\mathbb {N}\rightarrow \mathbb {N}\), if implementing a \(\textsf {read}\) or \(\textsf {write}\) operation on a single logical memory block requires the client to access \(\mathsf {B}\cdot \mathsf {Ovh}\left( {n}\right) \) words on the server.

2.1 Locally Decodable Codes (LDCs)

Locally decodable codes were first formally introduced by [KT00]. We rely on the following definition of smooth LDCs.

Definition 2

(Smooth LDC). A smooth k-query Locally Decodable Code (LDC) with message length \(n\), and codeword length M over alphabet \(\varSigma \), denoted by \(\left( k,n, M \right) _\varSigma \)-smooth LDC, is a triplet \(\left( \mathsf{{Enc}} ,\textsf {Query},\mathsf{{Dec}} \right) \) of PPT algorithms with the following properties.

  • Syntax. \(\mathsf{{Enc}} \) is given a message \(\textsf {msg}\in \varSigma ^{n}\) and outputs a codeword \(c\in \varSigma ^M\), \(\textsf {Query}\) is given an index \(\ell \in \left[ n\right] \) and outputs a vector \(\mathbf {r}= (r_1,\ldots ,r_k) \in \left[ M\right] ^k\), and \(\mathsf{{Dec}} \) is given \(c_{\mathbf {\mathbf {r}}}= (c_{r_1},\ldots ,c_{r_k}) \in \varSigma ^k\) and outputs a symbol in \(\varSigma \).

  • Local decodability. For every message \(\textsf {msg}\in \varSigma ^{n}\), and every index \(\ell \in \left[ n\right] \),

    $$\Pr \left[ \mathbf {r}\leftarrow \textsf {Query}\left( \ell \right) \ :\ \mathsf{{Dec}} \left( \mathsf{{Enc}} \left( \textsf {msg}\right) _{\mathbf {r}}\right) =\textsf {msg}_\ell \right] =1.$$
  • Smoothness. For every index \(\ell \in \left[ n\right] \), every query in the output of \(\textsf {Query}\left( \ell \right) \) is distributed uniformly at random over [M].

To simplify notations, when \(\varSigma =\{0,1\}\) we omit it from the notation.
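For intuition, a minimal (and parameter-wise very inefficient) example satisfying Definition 2 is the Hadamard code, a \(\left( 2,n,2^n\right) \)-smooth LDC. The Python sketch below is illustrative only; the message length n is passed to Query explicitly for simplicity.

```python
import random

# Hadamard code: Enc(msg)[a] = <msg, a> mod 2 for every a in {0,1}^n,
# so the codeword has length M = 2^n.
def Enc(msg):
    m = sum(bit << i for i, bit in enumerate(msg))
    return [bin(a & m).count("1") % 2 for a in range(2 ** len(msg))]

# Query(ell): a uniformly random pair (a, a XOR e_ell); each coordinate
# is individually uniform over [M], which gives smoothness.
def Query(n, ell):
    a = random.randrange(2 ** n)
    return [a, a ^ (1 << ell)]

# Dec: <msg, a> + <msg, a XOR e_ell> = <msg, e_ell> = msg[ell] (mod 2),
# so local decodability holds with probability 1.
def Dec(answers):
    return answers[0] ^ answers[1]
```

The exponential codeword length is exactly what the polynomial-length, constant-query LDCs assumed in Theorem 1 would improve on.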

Remark on Smooth LDCs for Block Messages. We will use smooth LDCs for messages consisting of blocks \(\{0,1\}^\mathsf {B}\) of bits (for some block size \(\mathsf {B}\in \mathbb {N}\)), whose existence is implied by the existence of smooth LDCs over \(\{0,1\}\). Indeed, given a \(\left( k,n, M \right) \)-smooth LDC \(\left( \mathsf{{Enc}} ,\textsf {Query},\mathsf{{Dec}} \right) \), one can obtain a \(\left( k,n, M \right) _{\{0,1\}^{\mathsf {B}}}\)-smooth LDC \(\left( \mathsf{{Enc}} ',\textsf {Query}',\mathsf{{Dec}} '\right) \) by “interpreting” the message and codeword as \(\mathsf {B}\) individual words, where the j’th word consists of the j’th bit in all blocks. Concretely, \(\mathsf{{Enc}} '\) on input a message \(\left( \textsf {msg}^1,\ldots ,\textsf {msg}^{n}\right) \in \left( \{0,1\}^{\mathsf {B}}\right) ^{n}\), computes \(y^1_j\ldots y^M_j=\mathsf{{Enc}} \left( \textsf {msg}^1_j,\ldots ,\textsf {msg}^{n}_j\right) \) for every \(1\le j\le \mathsf {B}\), sets \(c^i=y^i_1\ldots y^i_{\mathsf {B}}\), and outputs \(c=\left( c^1,\ldots ,c^M\right) \). \(\textsf {Query}'\) operates exactly as \(\textsf {Query}\) does. \(\mathsf{{Dec}} '\), on input \(c^{r_1},\ldots ,c^{r_k}\in \{0,1\}^{\mathsf {B}}\), computes \(z_j=\mathsf{{Dec}} \left( c^{r_1}_j,\ldots ,c^{r_k}_j\right) \) for every \(1\le j\le \mathsf {B}\), and outputs \(z_1\ldots z_{\mathsf {B}}\).
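The transformation above can be sketched as follows, using a toy 2-query Hadamard LDC as the underlying bit-LDC (a stand-in with exponential codeword length; all names are illustrative):

```python
import random

# Underlying toy bit-LDC (2-query Hadamard): enc(msg)[a] = <msg, a> mod 2.
def enc(msg):
    m = sum(b << i for i, b in enumerate(msg))
    return [bin(a & m).count("1") % 2 for a in range(2 ** len(msg))]

def query(n, ell):
    a = random.randrange(2 ** n)
    return [a, a ^ (1 << ell)]

def dec(answers):
    return answers[0] ^ answers[1]

def enc_blocks(blocks):
    """Enc': encode bit position j across all blocks with the bit-LDC,
    then regroup so codeword block i collects bit j of each bit-codeword."""
    B = len(blocks[0])
    per_bit = [enc([blk[j] for blk in blocks]) for j in range(B)]
    M = len(per_bit[0])
    return [[per_bit[j][i] for j in range(B)] for i in range(M)]

# Query' is identical to query: the same k positions serve every bit.

def dec_blocks(answers):
    """Dec': decode each bit position independently from the k answer blocks."""
    B = len(answers[0])
    return [dec([blk[j] for blk in answers]) for j in range(B)]
```

For example, with `blocks = [[1, 0], [0, 1], [1, 1]]`, a single pair of queries `r = query(3, ell)` lets `dec_blocks([c[r[0]], c[r[1]]])` recover `blocks[ell]` in full.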

2.2 Oblivious-Access Sort Algorithms

Our construction employs an Oblivious-Access Sort algorithm [BN16] which is, roughly, a RAM program that sorts its input, such that the access patterns of the algorithm on any two inputs of equal size are statistically close. Thus, oblivious-access sort is the “RAM version” of boolean sorting circuits. (Informally, a boolean sorting circuit is a boolean circuit ensemble \(\left\{ C\left( {n},\mathsf {B}\right) \right\} _{{n},\mathsf {B}}\) such that each \(C\left( {n},\mathsf {B}\right) \) takes as input \({n}\) size-\(\mathsf {B}\) tagged blocks, and outputs the blocks in sorted order according to their tags.)

Definition 3

(Oblivious-Access Sort Algorithm, [BN16]). An Oblivious-Access Sort algorithm for input size \({n}\) and block size \(\mathsf {B}\), with overhead \(\mathsf {Ovh_{Sort}}\left( {n},\mathsf {B}\right) \), is a (possibly randomized) algorithm \(\textsf {Sort}\) run by a client C on an input stored remotely on a server S, with the following properties:

  • Operation: The input consists of \({n}\) tagged blocks which are represented as length-\(\mathsf {B}\) bit strings (the tag is a substring of the block) and stored on the server. The client can perform local bit operations, but can only read and write full blocks from the server.

  • Overhead: The overhead of \(\textsf {Sort}\) is \(\mathsf {Ovh_{Sort}}\left( {n},\mathsf {B}\right) \).

  • Correctness: With overwhelming probability in \({n}\), at the end of the algorithm the server stores the blocks in sorted order according to their tags.

  • Oblivious Access: For a logical memory \(\textsf {DB}\) consisting of \({n}\) blocks of size \(\mathsf {B}\), let \(\textsf {AP}_{{n},\mathsf {B}}\left( \textsf {Sort},\textsf {DB}\right) \) denote the random variable consisting of the list of addresses accessed in a random execution of the algorithm \(\textsf {Sort}\) on \(\textsf {DB}\). Then for every pair \(\textsf {DB},\textsf {DB}'\) of inputs with \({n}\) size-\(\mathsf {B}\) blocks, \(\textsf {AP}_{{n},\mathsf {B}}\left( \textsf {Sort},\textsf {DB}\right) \approx ^s \textsf {AP}_{{n},\mathsf {B}}\left( \textsf {Sort},\textsf {DB}'\right) \), where \(\approx ^s\) denotes \(\mathsf{{negl}} \left( {n}\right) \) statistical distance.

Boyle and Naor [BN16] show that the existence of sorting circuits implies the existence of oblivious-access sort algorithms with related parameters:

Theorem 4

(Oblivious-access sort from sorting circuits, [BN16]). If there exist boolean sorting circuits \(\left\{ C\left( {n},\mathsf {B}\right) \right\} _{{n},\mathsf {B}}\) of size \(s\left( {n},\mathsf {B}\right) \), then there exists an oblivious-access sort algorithm for \({n}\) distinct elements with \(O\left( 1\right) \) client storage, \(O\left( {n}\cdot \log \mathsf {B}+s\left( \frac{2{n}}{\mathsf {B}},\mathsf {B}\right) \right) \) overhead, and \(e^{-n^{\varOmega \left( 1\right) }}\) probability of error.

Remark on the Existence of Oblivious-Access Sort Algorithms with Small Overhead. We note that for blocks of poly-logarithmic size \(\mathsf {B}={{\mathrm{poly}}}\log n\), the existence of sorting circuits of size \(s\left( n,\mathsf {B}\right) =O\left( n\cdot \mathsf {B}\cdot \log \log n\right) \) guarantees (through Theorem 4) the existence of oblivious-access sort algorithms with \(O\left( n\cdot \log \log n\right) \) overhead.

Remark on the Relation to Sorting Networks. The related notion of a sorting network has been extensively used in ORAM constructions. Similar to oblivious-access sort algorithms, sorting networks sort \({n}\) size-\(\mathsf {B}\) blocks in an oblivious manner. (More specifically, a sorting network is data oblivious, namely its memory accesses are independent of the input.) However, unlike oblivious-access sort algorithms and boolean sorting circuits, which can operate locally on the bits in the bit representation of the input blocks, a sorting network consists of a single type of compare-exchange gate which takes a pair of blocks as input, and outputs them in sorted order. We note that a simple information-theoretic lower bound of \(\varOmega \left( {n}\log {n}\right) \) on the network size is known for sorting networks (as well as matching upper bounds, e.g. [AKS83, Goo14]), whereas no such bound is known for boolean sorting circuits or oblivious-access sorting algorithms.
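The fixed compare-exchange schedule of a sorting network can be made concrete with a short sketch. The following is a minimal Python rendering of Batcher's bitonic network (a well-known \(O\left( {n}\log ^2{n}\right) \)-size network, not the [BN16] algorithm nor the AKS network cited above): the sequence of index pairs touched depends only on the input length, never on the data, which is exactly the data-obliviousness property described here.

```python
# Batcher's bitonic sorting network, for n a power of two. The schedule
# of compare-exchange gates is a function of n alone, so the memory
# access pattern is trivially independent of the input data.

def bitonic_pairs(n):
    """Yield the fixed (i, j, ascending) compare-exchange schedule."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    k = 2
    while k <= n:
        j = k // 2
        while j >= 1:
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Direction alternates between length-k bitonic blocks.
                    yield (i, partner, (i & k) == 0)
            j //= 2
        k *= 2

def bitonic_sort(blocks):
    """Sort by running the data-independent schedule over the array."""
    a = list(blocks)
    for i, j, asc in bitonic_pairs(len(a)):
        if (asc and a[i] > a[j]) or (not asc and a[i] < a[j]):
            a[i], a[j] = a[j], a[i]
    return a
```

Note that `bitonic_pairs` is called with only the input length, so any two equal-length inputs induce the identical gate sequence; only the swap outcomes (which are invisible in the access pattern) differ.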

2.3 Oblivious RAM (ORAM)

Oblivious RAMs were introduced by Goldreich and Ostrovsky [Gol87, Ost90, GO96]. To define oblivious RAMs, we will need the following notation for an access pattern.

Notation 1

(Access pattern). A length-q access pattern Q consists of a list \(\left( \textsf {op}_l,\mathsf {val}_l,\textsf {addr}_l\right) _{1\le l\le q}\) of instructions, where instruction \(\left( \textsf {op}_l,\mathsf {val}_l,\textsf {addr}_l\right) \) denotes that the client performs operation \(\textsf {op}_l\in \left\{ \textsf {read},\textsf {write}\right\} \) at address \(\textsf {addr}_l\) with value \(\mathsf {val}_l\) (which, if \(\textsf {op}_l=\textsf {read}\), is \(\perp \)).

Definition 4

(Oblivious RAM (ORAM)). An Oblivious RAM (ORAM) scheme with block size \(\mathsf {B}\) consists of procedures \(\left( \textsf {Setup},\textsf {Read},\textsf {Write}\right) \), with the following syntax:

  • \(\textsf {Setup}(1^\lambda , \textsf {DB})\) is a function that takes as input a security parameter \(\lambda \), and a logical memory \(\textsf {DB}\in \left( \{0,1\}^{\mathsf {B}}\right) ^{n}\), and outputs an initial server state \(\textsf {st}_S\) and a client key \({\mathsf {ck}}\). We require that the size of the client key \(|{\mathsf {ck}}|\) be bounded by some fixed polynomial in the security parameter \(\lambda \), independent of \(|\textsf {DB}|\).

  • \(\textsf {Read}\) is a protocol between the server S and the client C. The client holds as input an address \(\textsf {addr}\in [{n}]\) and the client key \({\mathsf {ck}}\), and the server holds its current state \(\textsf {st}_S\). The output of the protocol is a value \(\mathsf {val}\) to the client, and an updated server state \(\textsf {st}_S'\).

  • \(\textsf {Write}\) is a protocol between the server S and the client C. The client holds as input an address \(\textsf {addr}\in [{n}]\), a value v, and the client key \({\mathsf {ck}}\), and the server holds its current state \(\textsf {st}_S\). The output of the protocol is an updated server state \(\textsf {st}_S'\).

Throughout the execution of the \(\textsf {Read}\) and \(\textsf {Write}\) protocols, the server is used only as remote storage, and does not perform any computations.

We require the following correctness and security properties.

  • Correctness: In any execution of the \(\textsf {Setup}\) algorithm followed by a sequence of \(\textsf {Read}\) and \(\textsf {Write}\) protocols between the client and the server, in which the \(\textsf {Write}\) protocols were executed with a sequence V of values, the output of the client in every execution of the \(\textsf {Read}\) protocol is, with overwhelming probability, the value that the corresponding \(\textsf {read}\) operation would have returned had the prefix of V preceding that \(\textsf {Read}\) protocol been applied directly to the logical memory.

  • Security: For a logical memory \(\textsf {DB}\), and an access pattern Q, let \(\textsf {AP}\left( \textsf {DB},Q\right) \) denote the random variable consisting of the list of addresses accessed in the ORAM when the \(\textsf {Setup}\) algorithm is executed on \(\textsf {DB}\), followed by the execution of a sequence of \(\textsf {Read}\) and \(\textsf {Write}\) protocols according to Q. Then for every pair \(\textsf {DB}^0,\textsf {DB}^1\in \left( \{0,1\}^{\mathsf {B}}\right) ^{n}\) of inputs, and any pair \(Q^0=\left( \textsf {op}_l,\mathsf {val}_l^0,\textsf {addr}_l^0\right) _{1\le l\le q},Q^1=\left( \textsf {op}_l,\mathsf {val}_l^1,\textsf {addr}_l^1\right) _{1\le l\le q}\) of access patterns of length \(q={{\mathrm{poly}}}\left( \lambda \right) \), \(\textsf {AP}\left( \textsf {DB}^0,Q^0\right) \approx ^s\textsf {AP}\left( \textsf {DB}^1,Q^1\right) \), where \(\approx ^s\) denotes \(\mathsf{{negl}} \left( \lambda \right) \) statistical distance.

    If \(\textsf {AP}\left( \textsf {DB}^0,Q^0\right) ,\textsf {AP}\left( \textsf {DB}^1,Q^1\right) \) are only computationally indistinguishable, then we say the scheme is computationally secure.

Definition 4 does not explicitly specify who runs the \(\textsf {Setup}\) procedure. It can be performed by the client, who then sends the server state \(\textsf {st}_S\) to the server S, or (to save on client computation) can be delegated to a trusted third party.

Remark on Hiding the Type of Operation. Notice that Definition 4 does not hide whether the performed operation is a \(\textsf {read}\) or a \(\textsf {write}\), whereas an ORAM scheme is usually defined to hide this information. However, any such scheme can be generically made to hide the identity of operations by always performing both a \(\textsf {read}\) and a \(\textsf {write}\). (Specifically, in a \(\textsf {write}\) operation, one first performs a dummy \(\textsf {read}\); in a \(\textsf {read}\) operation, one writes back the value that was read.) Revealing the identity of operations allows us to obtain more fine-grained overheads.
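The generic transformation described above is simple enough to sketch. The wrapper below assumes a hypothetical underlying object exposing `read(addr)` and `write(addr, val)` methods; `PlainStore` is a toy, non-oblivious backend used only so that the access trace can be inspected.

```python
# Sketch of the generic read/write-hiding transformation: every logical
# operation performs both an underlying read and an underlying write,
# so the operation type is not visible in the access pattern.

class OpHidingORAM:
    def __init__(self, oram):
        self.oram = oram                 # any object with read/write

    def read(self, addr):
        val = self.oram.read(addr)
        self.oram.write(addr, val)       # write back the value just read
        return val

    def write(self, addr, val):
        self.oram.read(addr)             # dummy read, result discarded
        self.oram.write(addr, val)

class PlainStore:
    """Toy non-oblivious backend; records (op, addr) pairs for inspection."""
    def __init__(self, n):
        self.mem = [0] * n
        self.trace = []

    def read(self, addr):
        self.trace.append(('read', addr))
        return self.mem[addr]

    def write(self, addr, val):
        self.trace.append(('write', addr))
        self.mem[addr] = val
```

Under the wrapper, both logical operations induce the same read-then-write shape at the accessed address, at the cost of roughly doubling the overhead, which is precisely why the definition here opts to reveal operation types.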

Remark on Hiding Physical Memory Contents. The security property of Definition 4 implicitly assumes that the server does not see the contents of the physical memory: if the server is allowed to see it, he might be able to learn some non-trivial information regarding the access pattern, and thus violate the security property. As noted in Sect. 1.1, hiding the physical memory contents from the server can be achieved by encrypting the physical memory blocks, but security will then only hold against computationally-bounded servers, and so we choose to define security with the implicit assumption that the server does not see the memory contents (which also allows for cleaner constructions).

We will also consider the more restricted notion of a Read-Only (RO) ORAM scheme which, roughly, is an ORAM scheme that supports only read operations.

Definition 5

(Read-Only Oblivious RAM (RO-ORAM)). A Read-Only Oblivious RAM (RO-ORAM) scheme consists of procedures \(\left( \textsf {Setup},\textsf {Read}\right) \) with the same syntax as in Definition 4, in which correctness holds for any sequence of \(\textsf {Read}\) protocols between the client and the server, and security holds for any pair of access patterns \(R^0,R^1\) that contain only \(\textsf {read}\) operations.

3 Read-Only ORAM from Oblivious-Access Sort and Smooth LDCs

In this section we construct a Read-Only Oblivious RAM (RO-ORAM) scheme from oblivious-access sort algorithms and smooth LDCs. Concretely, we prove the following:

Theorem 5

Suppose there exist:

  • \(\left( k,n,M\right) \)-smooth LDCs with \(M={{\mathrm{poly}}}\left( n\right) \).

  • An oblivious-access sort algorithm \(\textsf {Sort}\) with \(s\left( {n},\mathsf {B}\right) \) overhead for input size \({n}\) and block size \(\mathsf {B}\).

Then there exists an RO-ORAM scheme for logical memory of size \({n}\) and blocks of size \(\mathsf {B}=\varOmega \left( \lambda \cdot k^2\cdot \log ^3\left( k{n}\right) \log ^7\log \left( k{n}\right) \right) \) with \(k+\frac{2k^2}{M}\cdot s\left( M,\mathsf {B}\right) +O\left( 1\right) \) overhead, and \(O\left( k\right) \) client storage.

Theorem 1 now follows from Theorem 5 (using also Theorem 4) for an appropriate instantiation of the sorting algorithm and LDC.

Corollary 1

(RO-ORAM, “dream” parameters; formal statement of Theorem 1). Suppose there exist:

  • \(\left( k,{n},M\right) \)-smooth LDCs with \(k=O\left( 1\right) \) and \(M={{\mathrm{poly}}}\left( {n}\right) \).

  • Boolean sorting circuits \(\left\{ C\left( {n},\mathsf {B}\right) \right\} _{{n},\mathsf {B}}\) of size \(s\left( {n},\mathsf {B}\right) =O\left( {n}\cdot \mathsf {B}\right) \) for input size \({n}\) and block size \(\mathsf {B}\).

Then there exists an RO-ORAM scheme for logical memory of size \({n}\) and blocks of size \(\varOmega \left( \lambda \cdot \log ^4{n}\right) \) with \(O\left( \log \log {n}\right) \) overhead, and \(O\left( 1\right) \) client storage.

We also instantiate our construction with sorting algorithms and LDCs with more “conservative” parameters, to obtain the following corollary.

Corollary 2

(RO-ORAM, milder parameters). Suppose there exist:

  • \(\left( k,{n},M\right) \)-smooth LDCs with \(k={{\mathrm{poly}}}\log \log n\) and \(M={{\mathrm{poly}}}\left( {n}\right) \).

  • Boolean sorting circuits \(\left\{ C\left( {n},\mathsf {B}\right) \right\} _{{n},\mathsf {B}}\) of size \(s\left( {n},\mathsf {B}\right) \in o\left( \frac{{n}\cdot \mathsf {B}\cdot \log {n}}{k^2}\right) \) for input size \({n}\) and block size \(\mathsf {B}\).

Then there exists an RO-ORAM scheme for memory of size \({n}\) and blocks of size \(\varOmega \left( \lambda \cdot \log ^4{n}\right) \) with \(o\left( \log {n}\right) \) overhead, and \({{\mathrm{poly}}}\log \log n\) client storage.

Construction Overview. As outlined in the introduction, our construction uses a \(\left( k,n,M\right) \)-smooth LDC. The server stores k codeword copies, each permuted using a unique uniformly random permutation. To read block j from the logical memory, the client runs the LDC decoder until the decoder generates a set of fresh decoding queries (i.e., a set \(q_1,\ldots ,q_k\) of queries such that for every \(1\le i\le k\), \(q_i\) was not issued before as the i’th query), and sends these queries to the server. The server uses the i’th permuted codeword copy to answer the i’th decoding query. The metadata regarding which decoding queries are fresh, as well as the description of the permutations, are stored on the server using a (polylogarithmic-overhead) ORAM scheme, which the client accesses to determine whether the decoder queries are fresh, and to permute them according to the random permutations.

The execution is divided into “epochs” consisting of \(O\left( M/k\right) \) \(\textsf {read}\) operations. When an epoch ends, the client “refreshes” the permuted codeword copies by picking k fresh, random permutations, and running an oblivious-access sort algorithm with the server to permute the codeword copies stored on the server according to the new permutations. The description of the new permutations is stored in the metadata ORAM (the client also resets the bits indicating which decoding queries are fresh). The refreshing operations are spread out across the \(O\left( M/k\right) \) \(\textsf {read}\) operations of the epoch. The resultant increase in complexity depends on k (which determines the epoch length, i.e., the frequency with which refreshing is needed), and on the overhead of the oblivious-access sort algorithm.
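The smoothness property the overview relies on — each decoder query being individually uniform over the codeword positions — is illustrated by the classical Hadamard code, a 2-query smooth LDC. The sketch below is illustrative only: its exponential codeword length \(M=2^n\) is far from the \({{\mathrm{poly}}}\left( n\right) \)-length codes the construction requires.

```python
import secrets

# Toy 2-query smooth LDC: the Hadamard code over GF(2). For a message
# x in {0,1}^n, codeword position a holds the inner product <a, x> mod 2.
# Each of the decoder's two queries is marginally uniform over [2^n],
# which is the smoothness property used throughout this section.

def bin_dot(a, x_bits):
    """Inner product mod 2 of the bits of integer a with x_bits."""
    return sum(((a >> i) & 1) * x_bits[i] for i in range(len(x_bits))) % 2

def hadamard_encode(x_bits):
    return [bin_dot(a, x_bits) for a in range(2 ** len(x_bits))]

def hadamard_query(n, j):
    """Two decoding queries for message bit j; each marginal is uniform."""
    r = secrets.randbelow(2 ** n)
    return r, r ^ (1 << j)

def hadamard_decode(a1, a2):
    # <r, x> + <r XOR e_j, x> = x_j (mod 2), by linearity in the index.
    return (a1 + a2) % 2
```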

Construction 1

(RO-ORAM from Oblivious-Access Sort and Smooth LDCs). The scheme uses the following building blocks:

  • A \(\left( k,{n}, M \right) _{\{0,1\}^{\mathsf {B}}}\)-smooth LDC \(\left( \mathsf{{Enc}} _{\textsf {LDC}},\textsf {Query}_{\textsf {LDC}},\mathsf{{Dec}} _{\textsf {LDC}}\right) \).

  • An oblivious-access sort algorithm \(\textsf {Sort}\).

  • An ORAM scheme \(\left( \textsf {Setup}_{{\mathrm {in}} },\textsf {Read}_{{\mathrm {in}} },\textsf {Write}_{{\mathrm {in}} }\right) \).

The scheme consists of the following procedures:

  • \({{\mathbf {\mathsf{{Setup}}}}}(1^\lambda ,{{\mathbf {\mathsf{{DB}}}}})\): Recall that \(\lambda \) denotes the security parameter, and \(\textsf {DB}\in \left( \{0,1\}^{\mathsf {B}}\right) ^{n}\). Instantiate the LDC with message size \({n}\) over alphabet \(\varSigma =\{0,1\}^{\mathsf {B}}\), and let k be the corresponding number of queries, and M be the corresponding codeword size. Proceed as follows.

    1.

      Counter initialization. Initialize a step counter \(\textsf {count}=0\).

    2.

      Data storage generation.

      (a)

        Generate the codeword \(\widetilde{\textsf {DB}}=\mathsf{{Enc}} _{\textsf {LDC}}\left( \textsf {DB}\right) \) with \(\widetilde{\textsf {DB}}\in \varSigma ^M\).

      (b)

        For every \(1\le i\le k\):

        • Generate a random permutation \(P^i:\left[ M\right] \rightarrow \left[ M\right] \).

        • Let \(\widetilde{\textsf {DB}}^i \in \varSigma ^M\) be a permuted version of the codeword which satisfies \(\widetilde{\textsf {DB}}^i_{P^i(j)} = \widetilde{\textsf {DB}}_j\) for all \(j \in [M]\).

    3.

      Metadata storage generation.

      (a)

        For every \(1\le i\le k\):

        • Initialize a length-M bit-array \(\textsf {Queried}^i\) to \(\mathbf {0}\).

        • Initialize a length-M array \(\textsf {Perm}^i\) over \(\{0,1\}^{\log M}\) such that \(\textsf {Perm}^i\left( j\right) =P^i\left( j\right) \).

      (b)

        Let \(\textsf {mDB}\) denote the logical memory obtained by concatenating \(\textsf {Queried}^1,\ldots ,\textsf {Queried}^k\) and \(\textsf {Perm}^1,\ldots ,\textsf {Perm}^k\). Run \(\left( {\mathsf {ck}}_m,\textsf {st}_m\right) \leftarrow \textsf {Setup}_{{\mathrm {in}} }\left( 1^{\lambda },\textsf {mDB}\right) \) to obtain the client key and server state for the metadata ORAM.

    4.

      Output. The long-term client key \({\mathsf {ck}}= {\mathsf {ck}}_m\) consists of the client key for the metadata ORAM. The server state \(\textsf {st}_S = \left( \left\{ \widetilde{\textsf {DB}}^i~:~ i \in [k]\right\} , \textsf {st}_m,\textsf {count}\right) \) contains the k permuted codewords, the server state for the metadata ORAM, and the step counter.

  • The Read protocol. To read the logical memory block at location \(\textsf {addr}\in [{n}]\) from the server S, the client C with key \({\mathsf {ck}}={\mathsf {ck}}_m\) operates as follows, where in all executions of the \(\textsf {Read}_{{\mathrm {in}} }\) or \(\textsf {Write}_{{\mathrm {in}} }\) protocols on \(\textsf {mDB}\), S plays the role of the server with state \(\textsf {st}_m\) and C plays the role of the client with key \({\mathsf {ck}}_m\).

    1.

      Generating decoder queries. Repeat the following \(\lambda \) times:

      • Run \(\left( q_1,\ldots ,q_k\right) \leftarrow \textsf {Query}_{\textsf {LDC}}\left( \textsf {addr}\right) \) to obtain decoding queries.

      • For every \(1\le i\le k\), run the \(\textsf {Read}_{{\mathrm {in}} }\) protocol to read \(\textsf {Queried}^i\left[ q_i\right] \). We say that \(q_i\) is fresh if \(\textsf {Queried}^i\left[ q_i\right] =0\).

      • Let \(\left( \hat{q}_1,\ldots ,\hat{q}_k\right) \) denote the decoding queries in the first iteration in which all queries were fresh. (If no such iteration exists, set \(\left( \hat{q}_1,\ldots ,\hat{q}_k\right) \) to be the decoding queries generated in the last iteration.)

    2.

      Permuting queries. For every \(1\le i\le k\), run the \(\textsf {Read}_{{\mathrm {in}} }\) protocol to read \(\textsf {Perm}^i\left[ \hat{q}_i\right] \). Let \(q_i'\) denote the value that \(\textsf {Read}_{{\mathrm {in}} }\) outputs to the client.

    3.

      Decoding logical memory blocks. Read \(\widetilde{\textsf {DB}}^1_{q_1'},\ldots ,\widetilde{\textsf {DB}}^k_{q_k'}\) from the server, and set the client output to \(\mathsf{{Dec}} _{\textsf {LDC}}\left( \widetilde{\textsf {DB}}^1_{q_1'},\ldots ,\widetilde{\textsf {DB}}^k_{q_k'}\right) \).

    4.

      Updating counter and server state. Let \(\ell =\frac{M}{2k}\). Read \(\textsf {count}\) from the server.

      • If \(\textsf {count}<\ell -1\), then update \(\textsf {count}:= \textsf {count}+1\), and for every \(1\le i\le k\), run the \(\textsf {Write}_{{\mathrm {in}} }\) protocol to write “1” to \(\textsf {Queried}^i\left[ \hat{q}_i\right] \).

      • Otherwise, update \(\textsf {count}:=0\), and for every \(1\le i\le k\):

        • Run the \(\textsf {Write}_{{\mathrm {in}} }\) protocol to write \(\mathbf {0}\) to \(\textsf {Queried}^i\).

        • Replace \(P^i\) with a fresh random permutation on \(\left[ M\right] \) by running the Fisher-Yates shuffle algorithm (as presented by Durstenfeld [Dur64]) on \(\textsf {Perm}^i\), using the \(\textsf {Read}_{{\mathrm {in}} }\) and \(\textsf {Write}_{{\mathrm {in}} }\) protocols.

        • Use \(\textsf {Sort}\) to sort \(\widetilde{\textsf {DB}}^i\) according to the new permutation \(P^i\) (each block consists of a codeword symbol, and the index in the codeword which is used as the tag of the block).

        If the complexity of these three steps is \(c_\mathsf{{epoch}}\), then the client performs \(c_\mathsf{{epoch}}/\ell \) steps of this computation in each protocol execution so that it is completed by the end of the epoch.
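The data path of Construction 1 can be sketched as follows. This is a deliberately simplified rendering: the metadata ORAM, the epoch counter, and the amortized re-permuting are elided, the \(\textsf {Queried}\) and \(\textsf {Perm}\) arrays are held as plain lists, and `encode`, `query`, `decode` are parameters standing in for the LDC procedures \(\mathsf{{Enc}} _{\textsf {LDC}},\textsf {Query}_{\textsf {LDC}},\mathsf{{Dec}} _{\textsf {LDC}}\).

```python
import random

# Simplified sketch of Construction 1's data path: Setup step 2 (store
# k independently permuted codeword copies) and Read steps 1-3 (retry
# for an all-fresh query tuple, permute the queries, decode).

def setup_data_storage(db, encode, k, rng):
    codeword = encode(db)                       # DB~ in Sigma^M
    M = len(codeword)
    perms, copies = [], []
    for _ in range(k):
        p = list(range(M))
        rng.shuffle(p)                          # uniformly random P^i
        copy = [None] * M
        for j in range(M):
            copy[p[j]] = codeword[j]            # DB~^i_{P^i(j)} = DB~_j
        perms.append(p)
        copies.append(copy)
    queried = [[0] * M for _ in range(k)]       # freshness bits, all zero
    return copies, perms, queried

def oram_read(addr, query, decode, copies, perms, queried, lam):
    k = len(copies)
    for _ in range(lam):
        chosen = query(addr)                    # (q_1, ..., q_k)
        if all(queried[i][chosen[i]] == 0 for i in range(k)):
            break                               # first all-fresh iteration
    for i in range(k):
        queried[i][chosen[i]] = 1               # mark queries as used
    answers = [copies[i][perms[i][chosen[i]]] for i in range(k)]
    return decode(answers)
```

Any wiring of this sketch with a trivial “code” (identity encoding, k identical queries) exercises only the plumbing; such a toy code is of course not a smooth LDC, so no security is implied.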

We prove the following claims about Construction 1.

Proposition 1

(ORAM security). Assuming the security of all of the building blocks, Construction 1 is a secure RO-ORAM scheme.

Proposition 2

(ORAM overhead). Assume that:

  • The logical memory \(\textsf {DB}\) has block size \(\mathsf {B}\), and the metadata ORAM has block size \(\mathsf {mB}\), satisfying \(\mathsf {B}> \mathsf {mB}\ge \log M\).

  • The metadata ORAM has overhead \(\mathsf {Ovh}\left( N\right) \) for memory of size N.

  • The oblivious-access sort algorithm has \(\mathsf {Ovh_{Sort}}\left( {n},\mathsf {B}\right) \) overhead when operating on inputs consisting of \({n}\) size-\(\mathsf {B}\) blocks.

Then every execution of the \(\textsf {Read}\) protocol in Construction 1 requires accessing

$$O\left( k\lambda +k^2\right) \cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( \frac{k\cdot \left( M+M\log M\right) }{\mathsf {mB}}\right) +\left( k+\frac{2k^2}{M}\cdot \mathsf {Ovh_{Sort}}\left( M,\mathsf {B}\right) \right) \cdot \mathsf {B}$$

words on the server.

Claims Imply Theorem. To prove Theorem 5, we instantiate the metadata ORAM of Construction 1 with the following variant of path ORAM [SvDS+13]:

Theorem 6

(Statistical ORAM with polylog overhead, implicit in [SvDS+13]). Let \(\lambda \) be a security parameter. Then there exists a statistical ORAM scheme with \(\mathsf{{negl}} \left( \lambda \right) \) error for logical memory consisting of N blocks of size \(\mathsf {mB}=\log ^2 N\log \log N\) with \(O\left( \log N\right) \) overhead, in which the client stores \(O\left( \log N\left( \lambda +\log \log N\right) \right) \) blocks.

Moreover, initializing the scheme requires accessing \(O\left( N\cdot \mathsf {mB}\right) \) words, and the server stores \(O\left( N\right) \) blocks.

Proof

of Theorem 5. Security follows directly from Proposition 1 since (as noted in Sect. 2.1) the existence of a \(\left( k,{n},M\right) \)-smooth LDC implies the existence of a \(\left( k,{n},M\right) _{\{0,1\}^{\mathsf {B}}}\)-smooth LDC.

As for the overhead of the construction, let \(N_m= k\left( M+M\log M\right) \) denote the size (in bits) of the metadata ORAM. Substituting \(\mathsf {mB}= \log ^2N_m\log \log N_m\), and \(\mathsf {Ovh}\left( N\right) =O\left( \log N\right) \) (according to Theorem 6), Proposition 2 guarantees that every execution of the \(\textsf {Read}\) protocol requires accessing

$$O\left( k\lambda +k^2\right) \cdot \log ^2N_m\log \log N_m\cdot O\left( \log N_m\right) +\left( k+\frac{2k^2}{M}\cdot s\left( M,\mathsf {B}\right) \right) \cdot \mathsf {B}$$

words on the server. The first summand can be upper bounded by

$$k^2\lambda \cdot \log ^2\left( kM\right) \log ^3\log \left( kM\right) \cdot O\left( \log \left( kM\right) \right) \le k^2\lambda \cdot \log ^3\left( kM\right) \log ^3\log \left( kM\right) .$$

For \(\mathsf {B}=\varOmega \left( \lambda \cdot k^2\cdot \log ^3\left( k{n}\right) \log ^7\log \left( k{n}\right) \right) \) (as in the theorem statement) with a sufficiently large constant in the \(\varOmega \left( \cdot \right) \) notation, and since \(M={{\mathrm{poly}}}\left( {n}\right) \), this corresponds to accessing \(O\left( \mathsf {B}\right) \) words on the server, so the overhead is \(k+\frac{2k^2}{M}\cdot s\left( M,\mathsf {B}\right) +O\left( 1\right) \).

Finally, regarding client storage, emulating the LDC decoder requires storing k size-\(\mathsf {B}\) blocks (i.e., the answers to the decoder queries). Operations on \(\textsf {mDB}\) require (by Theorem 6) storing \(O\left( \log N_m\left( \lambda +\log \log N_m\right) \right) \) size-\(\mathsf {mB}\) blocks which corresponds to a constant number of size-\(\mathsf {B}\) blocks.    \(\square \)

Security Analysis: Proof of Proposition 1. The proof of Proposition 1 will use the next lemma, which states that with overwhelming probability, every \(\textsf {Read}\) protocol execution uses fresh decoding queries. This follows from the smoothness of the underlying LDC.

Lemma 1

Let \(k,M\in \mathbb {N}\), and let \(X=\left( X_1,\ldots ,X_k\right) \) be a random variable over \(\left[ M\right] ^k\) such that for every \(1\le i\le k\), \(X_i\) is uniformly distributed over \(\left[ M\right] \). Let \(S_1,\ldots ,S_k\subseteq \left[ M\right] \) be subsets of size at most \(\ell \). Then in l independent samples according to X, with probability at least \(1-\left( k\cdot \frac{\ell }{M}\right) ^{l}\), there exists a sample \(\left( x_1,\ldots ,x_k\right) \) such that \(x_i\notin S_i\) for every \(1\le i\le k\).

In particular, if \(\ell =\frac{M}{2k}\) and \(l=\varOmega \left( \lambda \right) \) then except with probability \(\mathsf{{negl}} \left( \lambda \right) \), there exists a sample \(\left( x_1,\ldots ,x_k\right) \) such that \(x_i\notin S_i\) for every \(1\le i\le k\).

Proof

Consider a sample \(\left( x_1,\ldots ,x_k\right) \) according to X. Since each \(X_i\) is uniformly distributed over \(\left[ M\right] \), then \(\Pr \left[ x_i\in S_i\right] \le \frac{\ell }{M}\), so by the union bound, \(\Pr \left[ \exists i\ :\ x_i\in S_i\right] \le k\cdot \frac{\ell }{M}\). Since the l samples are independent, the probability that no such sample exists is \(\left( \Pr \left[ \text {in\ a\ single\ sample},\ \exists i\ :\ x_i\in S_i\right] \right) ^l\le \left( k\cdot \frac{\ell }{M}\right) ^l\). For the “in particular” part, notice that for \(\ell =\frac{M}{2k}\) and \(l=\varOmega \left( \lambda \right) \), \(1-\left( k\cdot \frac{\ell }{M}\right) ^{l}=1-2^{-\varOmega \left( \lambda \right) }\).   \(\square \)
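Lemma 1's bound is easy to sanity-check empirically. The snippet below estimates, for the special case of independent uniform coordinates (the lemma only requires uniform marginals), the probability that no sample avoids every \(S_i\), which should fall below the \(\left( k\cdot \frac{\ell }{M}\right) ^{l}\) bound.

```python
import random

# Monte Carlo sanity check of Lemma 1: with k coordinates each uniform
# over [M] (taken independent here, a special case of the lemma), size-ell
# subsets S_i, and l independent samples, the probability that no sample
# is fresh in every coordinate should be at most (k*ell/M)^l.

def no_fresh_sample_prob(k, M, ell, l, trials=20000, rng=random):
    sets = [set(range(ell)) for _ in range(k)]   # any fixed size-ell sets
    fail = 0
    for _ in range(trials):
        ok = False
        for _ in range(l):
            if all(rng.randrange(M) not in sets[i] for i in range(k)):
                ok = True                        # found an all-fresh sample
                break
        fail += not ok
    return fail / trials
```

For example, with \(k=2\), \(M=16\), \(\ell =4\), \(l=4\), the bound is \(\left( 1/2\right) ^4=0.0625\), and the empirical failure rate lands comfortably below it.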

We are now ready to prove Proposition 1.

Proof of Proposition 1. The correctness of the scheme follows directly from the correctness of the underlying LDC. We now argue security. Let \(\textsf {DB}^0,\textsf {DB}^1\) be two logical memories consisting of \({n}\) size-\(\mathsf {B}\) blocks, and let \(R^0,R^1\) be two sequences of \(\textsf {read}\) operations of length \(q={{\mathrm{poly}}}\left( \lambda \right) \). We proceed via a sequence of hybrids. We assume that in each \(\textsf {read}\) operation, at least one iteration in the \(\textsf {Read}\) protocol succeeded in generating fresh decoder queries, and condition all hybrids on this event. This is without loss of generality since by Lemma 1, this happens with overwhelming probability.

 

\(\mathbf {\mathcal {H}_0^b:}\) :

Hybrid \(\mathcal {H}_0^b\) is the access pattern \(\textsf {AP}\left( \textsf {DB}^b,R^b\right) \) in an execution of read sequence \(R^b\) on the RO-ORAM generated for logical memory \(\textsf {DB}^b\).

\(\mathbf {\mathcal {H}_1^b:}\) :

In hybrid \(\mathcal {H}_1^b\), for every \(1\le i\le k\), we replace the values of \(\textsf {Queried}^i\) and \(\textsf {Perm}^i\) with dummy values of (e.g.,) the all-0 string. Moreover, we replace all \(\textsf {read}\) and \(\textsf {write}\) accesses to the metadata \(\textsf {mDB}\) with dummy operations that (e.g.,) read and write the all-0 string to the first location in the metadata. (We note that the accesses to the permuted codewords remain unchanged, where each access consists of fresh decoding queries, permuted according to \(P^1,\ldots ,P^k\).)

Hybrids \(\mathcal {H}_0^b\) and \(\mathcal {H}_1^b\) are statistically indistinguishable by the security of the metadata ORAM.

\(\mathbf {\mathcal {H}_2^b:}\) :

In hybrid \(\mathcal {H}_2^b\), for every \(1\le i \le k\), and every epoch j, we replace the permutation on which the oblivious-access sort algorithm \(\textsf {Sort}\) is applied, with a dummy permutation (e.g., the identity). (As in \(\mathcal {H}_1^b\), the accesses to the codeword copies remain unchanged, and in particular the “right” permutations are used in all epochs.)

Hybrids \(\mathcal {H}_1^b\) and \(\mathcal {H}_2^b\) are statistically indistinguishable by the obliviousness property of the oblivious-access sort algorithm.

\(\mathbf {\mathcal {H}_3^b:}\) :

In hybrid \({\mathcal {H}_3^b}\), for every \(1\le i\le k\), we replace the queries to the i’th permuted codeword with queries that are uniformly random subject to the constraint that they are all distinct.

Hybrids \(\mathcal {H}_2^b\) and \(\mathcal {H}_3^b\) are statistically indistinguishable since by our assumption all the queries sent to the codeword copies are fresh, and they are permuted using random permutations. (Notice that \(\mathcal {H}_2^b,\mathcal {H}_3^b\) contain no additional information regarding these permutations.)

 

We conclude the proof by noting that \(\mathcal {H}_3^0\equiv \mathcal {H}_3^1\) since neither depend on \(\textsf {DB}^0,\textsf {DB}^1,R^0\) or \(R^1\).    \(\square \)

Complexity Analysis: Proof of Proposition 2. We now analyze the complexity of Construction 1, proving Proposition 2. Notice that since \(\mathsf {mB}\ge \log M\), an image of any random permutation \(P^i:\left[ M\right] \rightarrow \left[ M\right] \) is contained in a single block of \(\textsf {mDB}\). Notice also that the metadata \(\textsf {mDB}\) consists of \(k\cdot \left( M+M\log M\right) \) bits, and let \(N_m:=\frac{k\cdot \left( M+M\log M\right) }{\mathsf {mB}}\) denote its size in size-\(\mathsf {mB}\) blocks. Recall that a word (i.e., the basic unit of the physical memory stored on the server) consists of w bits.

Proof of Proposition 2. Every execution of the \(\textsf {Read}\) protocol consists of the following operations:

  • Reading \(k\cdot \lambda \) bits from \(\textsf {mDB}\) to check if the decoding queries in each of the \(\lambda \) iterations are fresh. Reading each bit requires reading a different block from \(\textsf {mDB}\), which requires accessing \(k\lambda \cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( N_m\right) \) words on the server.

  • Reading k images from \(\textsf {Perm}^1,\ldots ,\textsf {Perm}^k\) to permute the chosen decoding queries. This requires reading k blocks from \(\textsf {mDB}\), which requires accessing \(k\cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( N_m\right) \) words on the server.

  • Reading k blocks from the permuted codewords \(\widetilde{\textsf {DB}}^1,\ldots ,\widetilde{\textsf {DB}}^k\) to answer the decoder queries, which requires accessing \(\frac{\mathsf {B}}{w}\cdot k\) words on the server.

  • Writing k bits to \(\textsf {mDB}\) to update the values \(\textsf {Queried}^i\left[ \hat{q}^i\right] ,1\le i\le k\), to 1, in total accessing \(k\cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( N_m\right) \) words on the server. (This operation is only performed when \(\textsf {count}<\ell -1\), but counting it in every \(\textsf {Read}\) execution will not increase the overall asymptotic complexity.)

  • Updating the counter, which requires accessing \(\frac{\lambda }{w}\) words on the server.

In total, these operations require accessing \( O\left( k\lambda \right) \cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( N_m\right) +k\cdot \frac{\mathsf {B}}{w}\) words on the server.

In addition, every \(\textsf {Read}\) execution performs its “share” of the operations needed to update the server state at the end of the epoch. More specifically, it performs a \(\frac{1}{\ell }=\frac{2k}{M}\)-fraction of the following operations:

  • Writing \(k\cdot \frac{M}{\mathsf {mB}}\) blocks to \(\textsf {mDB}\) to reset all entries of \(\textsf {Queried}^i,1\le i\le k\), as well as reading and writing \(k\cdot 2M\) blocks to \(\textsf {mDB}\) to update the entries of \(\textsf {Perm}^i,1\le i\le k\) with the images of the new permutations, using the Fisher-Yates shuffle. In total, this requires accessing \( k\cdot M\cdot \left( \frac{1}{\mathsf {mB}}+4\right) \cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( N_m\right) \) words on the server.

  • Running k executions of \(\textsf {Sort}\) on an input of M blocks of size \(\mathsf {B}\) to re-permute the codeword copies, which requires accessing \(k\cdot \mathsf {Ovh_{Sort}}\left( M,\mathsf {B}\right) \) words on the server.

So these update operations require accessing \(O\left( k^2\right) \cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( N_m\right) +\frac{2k^2}{M}\cdot \mathsf {B}\cdot \mathsf {Ovh_{Sort}}\left( M,\mathsf {B}\right) \) words on the server per execution of the \(\textsf {Read}\) protocol.

In summary, reading a single logical block from \(\textsf {DB}\) requires accessing \(O\left( k\lambda +k^2\right) \cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( \frac{k\cdot \left( M+M\log M\right) }{\mathsf {mB}}\right) +\left( \frac{k}{w}+\frac{2k^2}{M}\cdot \mathsf {Ovh_{Sort}}\left( M,\mathsf {B}\right) \right) \cdot \mathsf {B}\) words on the server.    \(\square \)

3.1 Read-Only ORAM with Oblivious Setup

In this section we generalize the notion of an RO-ORAM scheme to allow the client to run the ORAM \(\textsf {Setup}\) algorithm, using the server as remote storage, when the logical memory is already stored at the server. We call this primitive an RO-ORAM scheme with oblivious setup. This primitive will be used in the next section to construct an ORAM scheme supporting writes with low \(\textsf {read}\) overhead.

At a high level, an RO-ORAM scheme with oblivious setup is an RO-ORAM scheme \(\left( \textsf {Setup},\textsf {Read}\right) \) associated with an additional protocol \(\textsf {OblSetup}\) which allows the client to execute the \(\textsf {Setup}\) algorithm using the server as remote storage when the logical memory is already stored on the server, where the execution is oblivious in the sense that the scheme remains secure when the RO-ORAM is generated using \(\textsf {OblSetup}\) instead of \(\textsf {Setup}\).

In the full version [WW18] we formalize this notion, and show that the RO-ORAM scheme of Construction 1 has oblivious setup. The oblivious setup protocol relies on the building blocks of Construction 1, and additionally uses a CPA-secure symmetric encryption scheme (whose existence follows from the existence of OWFs). The high-level idea is conceptually simple. The client first encrypts the logical memory, then generates the codeword copies by encoding the encrypted logical memory. This can be done by running the encoding procedure of the LDC “in the clear” (using the server as remote storage), because by the CPA-security of the encryption scheme, the access pattern of the encoding procedure reveals no information on the logical memory. (Indeed, the access pattern might depend on the values of the ciphertexts, but those are computationally indistinguishable from encryptions of 0.) Then, the client can use an “empty” metadata (initialized to \(\mathbf {0}\)) to generate his keys for the metadata ORAM, and update its contents by running the \(\textsf {Write}\) protocol of the metadata ORAM together with the server. Finally, the codeword copies can be obliviously permuted using the oblivious-access sort algorithm. This high-level intuition is formalized in the full version [WW18], where we prove the following:

Lemma 2

(RO-ORAM with oblivious setup). Assuming OWFs, and assuming the security of the building blocks of Construction 1, there exists a computationally-secure RO-ORAM scheme with oblivious setup. Moreover, if:

  • the logical memory \(\textsf {DB}\) has block size \(\mathsf {B}\), and the metadata ORAM has block size \(\mathsf {mB}\), satisfying \(\mathsf {B}>\mathsf {mB}\ge \log M\),

  • the metadata ORAM has \(\mathsf {Ovh}\left( N\right) \) overhead for memories of size N, and its setup algorithm can be executed using the server as remote storage by accessing \(\textsf {T}_m\left( N\right) \) words on the server, where the client (server) stores \(s_C\) (\(s_S\)) size-\(\mathsf {mB}\) blocks,

  • the oblivious-access sort algorithm has \(\mathsf {Ovh_{Sort}}\left( {n},\mathsf {B}\right) \) overhead when operating on inputs consisting of \({n}\) size-\(\mathsf {B}\) blocks,

  • the LDC has query complexity k, codeword length M, and on messages of length \(n\) its encoding procedure performs \(\textsf {T}_{\textsf {LDC}}\left( n\right) \) operations (i.e., touches \(\textsf {T}_{\textsf {LDC}}\left( n\right) \) message symbols),

then the \(\textsf {OblSetup}\) protocol accesses

$$\lambda +\textsf {T}_m\left( \frac{k\left( M+M\log M\right) }{\mathsf {mB}}\right) + 2n\cdot \frac{\mathsf {B}}{w} + \textsf {T}_{\textsf {LDC}}\left( n\right) \cdot \frac{\mathsf {B}}{w}+kM\cdot \frac{\mathsf {B}}{w} $$
$$ + \left( \frac{kM}{\mathsf {mB}}+kM\right) \cdot \mathsf {mB}\cdot \mathsf {Ovh}\left( \frac{k\left( M+M\log M\right) }{\mathsf {mB}}\right) + k\cdot \mathsf {B}\cdot \mathsf {Ovh_{Sort}}\left( {n},\mathsf {B}\right) $$

words on the server, where w denotes the word size. Moreover, the client stores \(s_C\cdot \frac{\mathsf {mB}}{\mathsf {B}}\) size-\(\mathsf {B}\) blocks, and the server stores \(n+kM+s_S\cdot \frac{\mathsf {mB}}{\mathsf {B}}+\lambda \) size-\(\mathsf {B}\) blocks.

A Note on Statistically-Secure RO-ORAM with Oblivious Setup. Our RO-ORAM scheme with oblivious setup is only computationally-secure, even assuming the server does not see the memory contents. This is due to the fact that the access pattern during LDC-encoding might depend on the contents of the message being encoded, which in our case is the encrypted contents of the logical memory. Since the encryptions of two logical memories are only computationally indistinguishable, the resultant security is computational. We note that using an LDC with additional properties, we can obtain a statistically-secure RO-ORAM scheme with oblivious setup. Concretely, if the LDC encoding procedure is oblivious in the sense that its access pattern is independent of the contents of the message being encoded (a property satisfied by, e.g., linear codes), then one can run the LDC encoding procedure on the logical memory itself, and encryption is not needed. Similarly, if the LDC has a sufficiently small encoding circuit, then encoding can be performed directly on the (un-encrypted) logical memory.
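To illustrate why linear codes have this obliviousness property, the following toy sketch encodes a message under a (hypothetical) generator matrix over GF(2) while recording every message position that is read; the recorded access pattern depends only on the matrix, never on the message values:

```python
def linear_encode(G, msg, trace):
    # Encode msg under generator matrix G over GF(2), appending to `trace`
    # the index of every message symbol that is touched.  Which positions
    # are read is fixed by G alone, so the access pattern is oblivious.
    codeword = []
    for col in range(len(G[0])):
        s = 0
        for row in range(len(msg)):
            if G[row][col]:
                trace.append(row)  # access-pattern entry: depends only on G
                s ^= msg[row]
        codeword.append(s)
    return codeword
```

Running the encoder on two different messages of the same length produces identical traces, which is exactly the property that lets the encoding run on the un-encrypted logical memory.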

4 Oblivious RAM Supporting Writes with \(o\left( \log n\right) \) Read Complexity

In this section we extend the RO-ORAM scheme of Sect. 3 to support writes, while preserving the overhead of read operations. We instantiate our construction in several parameter regimes, obtaining the following results (see the full version [WW18] for the proofs).

First, by instantiating our construction with “best possible” sorting circuits and LDCs, we prove Theorem 2:

Theorem 7

(ORAM, “dream” parameters; formal statement of Theorem 2). Assume the existence of OWFs, as well as LDCs and sorting circuits as in Corollary 1, where the LDC has the following additional properties:

  • \(M={n}^{1+\delta }\) for some \(\delta \in \left( 0,1\right) \).

  • Encoding requires \(M^{1+\gamma }\) operations over size-\(\mathsf {B}\) blocks, for some \(\gamma \in \left( 0,1\right) \).

Then there exists an ORAM scheme for memories of size \({n}\) and blocks of size \(\mathsf {B}=\varOmega \left( \lambda \cdot \log ^3{n}\log ^7\log {n}\right) \) with \(O\left( 1\right) \) client storage, where \(\textsf {read}\) operations have \(O\left( \log \log {n}\right) \) overhead, and \(\textsf {write}\) operations have \(O\left( {n}^{\epsilon }\right) \) overhead for any constant \(\epsilon \in \left( 0,1\right) \) such that \(\epsilon >\delta +\gamma +\delta \gamma \).

Using milder assumptions regarding the parameters of the underlying sorting circuit and LDC, we can prove the following:

Theorem 8

(ORAM, milder parameters). Assume the existence of OWFs, as well as LDCs and sorting circuits as in Corollary 2, where the LDC has the additional properties specified in Theorem 7. Then there exists an ORAM scheme for memories of size \({n}\) and blocks of size \(\mathsf {B}=\varOmega \left( \lambda \cdot \log ^3{n}\log ^7\log {n}\right) \) with \({{\mathrm{poly}}}\log \log n\) client storage, where \(\textsf {read}\) operations have \(o\left( \log {n}\right) \) overhead, and \(\textsf {write}\) operations have \(O\left( {n}^{\epsilon }\right) \) overhead for any constant \(\epsilon \in \left( 0,1\right) \) such that \(\epsilon >\delta +\gamma +\delta \gamma \).

Finally, we also obtain a scheme with improved \(\textsf {write}\) overhead, by somewhat strengthening the assumptions regarding the LDC.

Theorem 9

(ORAM, low write overhead; formal statement of Theorem 3). Assume the existence of OWFs, as well as LDCs and sorting circuits as in Corollary 1, where the LDC has the following additional properties:

  • \(M={n}^{1+o\left( 1\right) }\).

  • Encoding requires \(M^{1+o\left( 1\right) }\) operations over size-\(\mathsf {B}\) blocks.

Then there exists an ORAM scheme for memories of size \({n}\) and blocks of size \(\mathsf {B}=\varOmega \left( \lambda \cdot \log ^3{n}\log ^7\log {n}\right) \) with \(O\left( 1\right) \) client storage, where \(\textsf {read}\) operations have \(o\left( \log {n}\right) \) overhead, and \(\textsf {write}\) operations have \({n}^{o\left( 1\right) }\) overhead.

Construction Overview. As outlined in Sect. 1.2, the ORAM consists of \(\ell \) levels of increasing size (growing from top to bottom), where initially the logical memory is stored in the lowest level, and all other levels are empty. \(\textsf {read}\) operations look for the memory block in all levels, returning the top-most copy of the block, and \(\textsf {write}\) operations write the memory block to the top-most level, causing a reshuffle at predefined intervals to prevent levels from overflowing.

Transforming this high-level intuition into an actual scheme requires some adjustments. First, our RO-ORAM schemeFootnote 4 was designed for logical memories given as array data structures (namely, in which blocks can only be accessed by specifying the location of the block in the logical memory), but upper levels are too small to contain the entire logical memory, so they require RO-ORAM schemes for map data structures.Footnote 5 To overcome this issue, we associate with each level i an array \(\mathcal {DB}^i\) that contains the memory blocks of level i, and is stored in an RO-ORAM \(\mathcal {O}^i\) (for array data structures). Additionally, we store the metadata regarding which block appears in which array location in a (standard, polylogarithmic-overhead) ORAM \(\mathcal {MO}^i\) for map structures. Thus, to look for block j in level i, the client first searches for j in \(\mathcal {MO}^i\). If the j’th memory block appears in level i, then \(\mathcal {MO}^i\) returns the location t in which it appears in \(\mathcal {DB}^i\), and so the client can read the block by performing a read for address t on the RO-ORAM \(\mathcal {O}^i\) of the level.

Second, to allow for efficient “reshuffling” of level i (which, in particular, requires a traversal of both \(\mathcal {DB}^i\) and \(\mathcal {DB}^{i+1}\)), we also store \(\mathcal {DB}^i\) in every level i. Thus, every level i contains the array \(\mathcal {DB}^i\), the metadata ORAM \(\mathcal {MO}^i\) which maps blocks to their locations in \(\mathcal {DB}^i\), and the RO-ORAM \(\mathcal {O}^i\) which stores \(\mathcal {DB}^i\). We note that the metadata ORAM is not needed in the lowest level, because the structure will preserve the invariant that \(\mathcal {DB}^{\ell }\) contains all the blocks “in order” (namely, the k’th block of the logical memory is the k’th block of \(\mathcal {DB}^{\ell }\)).
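The resulting per-level layout can be sketched as a small data structure (an illustrative model only: `Level`, `find_block`, and the use of a plain dictionary for \(\mathcal {MO}^i\) are simplifying assumptions, not the scheme's actual interfaces):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Level:
    DB: list            # the level's block array DB^i (kept to allow efficient reshuffles)
    MO: Optional[dict]  # metadata ORAM MO^i: logical address -> index in DB^i (None at the lowest level)
    O: list             # RO-ORAM O^i over DB^i, modeled here as a read-only copy of DB^i

def find_block(level: Level, addr: int):
    # Mirrors the lookup in the text: ask MO^i for an index t, then read O^i
    # at position t; at the lowest level the logical address is itself the index.
    if level.MO is None:
        return level.O[addr]
    t = level.MO.get(addr)
    return None if t is None else level.O[t]
```

In the real scheme both lookups are oblivious protocol executions; here they are plain accesses, which suffices to show how the three components of a level fit together.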

Finally, every “reshuffle” of level i into level \(i+1\) requires re-generation of the RO-ORAM \(\mathcal {O}^{i+1}\), since the contents of \(\mathcal {DB}^{i+1}\) have changed. In general, re-generation cannot use the setup algorithm of the RO-ORAM for two reasons. First, the setup is designed to be run by a trusted party, and so the server cannot run it, and since setup depends on the entire logical memory, it is too costly for the client to run on his own. Second, while the setup of an RO-ORAM is only required to be polynomial-time (since it is only executed once, and so its cost is amortized over sufficiently many accesses to the RO-ORAM), when executed repeatedly as part of reshuffle, a more stringent efficiency requirement is needed. The first issue is addressed by the RO-ORAM with oblivious setup primitive (Sect. 3.1). For the second, we use the fact that our RO-ORAM scheme described in Sect. 3 has a highly-efficient oblivious setup protocol.

Given these building blocks, the ORAM operates as follows. To read the j’th logical memory block, the client looks for the block in every level. At the lowest level \(\ell \), which contains the entire logical memory, this is done by reading the block at address j from \(\mathcal {O}^{\ell }\). For all other levels \(1\le i<\ell \), this is done by first reading j from \(\mathcal {MO}^i\) to check whether the j’th memory block appears in \(\mathcal {DB}^i\), and if so in which index t; and then using \(\mathcal {O}^i\) to read the t’th block of \(\mathcal {DB}^i\). (If the j’th block does not appear in \(\mathcal {DB}^i\), a dummy read is performed on \(\mathcal {O}^i\).) The output is the copy of block j from \(\mathcal {DB}^{i^*}\) for the smallest level \(i^*\) such that \(\mathcal {DB}^{i^*}\) contains the j’th memory block. This is the “correct” answer because the levels preserve the invariant that each level contains at most one copy of each logical memory block, and the most recent copy appears in the top-most level that contains the block.

To write value v to the block at address j, the client asks the server to write a new copy of block j with value v to the top level. As noted above, this causes a reshuffle into lower levels at predefined intervals to prevent levels from overflowing. More specifically, every \(l_i\) \(\textsf {write}\) operations, level i is reshuffled into level \(i+1\), where \(l_i\) denotes the size of level i. During reshuffle, all memory blocks from \(\mathcal {DB}^i\) are copied into \(\mathcal {DB}^{i+1}\), and multiple copies of the same memory block are consolidated by storing the level-i copy. Additionally, the ORAMs \(\mathcal {MO}^{i+1},\mathcal {O}^{i+1}\) of level \(i+1\) are updated, and level i is emptied (that is, \(\mathcal {DB}^{i}\) is replaced with an empty array, and \(\mathcal {MO}^i,\mathcal {O}^i\) are updated accordingly). See Figs. 2 and 4 for an example.
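The consolidation rule can be sketched as follows, under the simplifying assumption that blocks are plaintext (address, value) pairs; the actual protocol performs this merge obliviously, via \(\mathcal {MO}^i\) and dummy accesses, as depicted in Figs. 2 and 4:

```python
def reshuffle_merge(db_i, db_next):
    # Copy the blocks of DB^i into DB^{i+1}; when an address appears in
    # both levels, keep the level-i copy, which is the more recent one.
    merged = dict(db_next)          # start from the lower level's blocks
    for addr, val in db_i:
        merged[addr] = val          # the level-i copy overwrites
    return sorted(merged.items())
```

After the merge, level i is emptied and fresh RO-ORAMs are generated for both levels via the oblivious setup protocol.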

Instantiating this ORAM scheme with different values of the number of levels \(\ell \) yields ORAM schemes with different tradeoffs between the \(\textsf {read}\) and \(\textsf {write}\) overhead. Concretely, Theorems 7 and 8 are obtained by setting \(\ell \) to be constant, and Theorem 9 is obtained by setting \(\ell =\frac{\log n}{\log ^2\log n}\).

We now formally describe the construction.

Construction 2

(ORAM with writes). The scheme uses the following building blocks:

  • An RO-ORAM scheme with oblivious setup \(\left( \textsf {Setup}_R,\textsf {Read}_R,\textsf {OblSetup}_R\right) \).

  • An ORAM scheme \(\left( \textsf {Setup}_m,\textsf {Read}_m,\textsf {Write}_m\right) \) for map data structures.

We define the following protocols.

  • \({{\mathbf {\mathsf{{Setup}}}}}(1^\lambda ,{{\mathbf {\mathsf{{DB}}}}})\) : Recall that \(\lambda \) denotes the security parameter, and \(\textsf {DB}\in \left( \{0,1\}^{\mathsf {B}}\right) ^{n}\). \(\textsf {Setup}\) does the following.

  • Initialize a writes counter. Initialize a counter \(\textsf {count}\) to 0.

    • Initialize lowest level.

      • Initialize \(\mathcal {DB}^{\ell }=\textsf {DB}\). We assume without loss of generality that the blocks in \(\textsf {DB}\) are of the form \(\left( j,b_j\right) \), namely each logical memory block contains its logical address.Footnote 6

      • Generate an RO-ORAM scheme \(\mathcal {O}^{\ell }\) for \(\mathcal {DB}^{\ell }\) by running \(\left( {\mathsf {ck}}_R^{\ell },\textsf {st}_R^{\ell }\right) \leftarrow \textsf {Setup}_R\left( 1^{\lambda },\mathcal {DB}^{\ell }\right) \) to obtain a client key \({\mathsf {ck}}_R^{\ell }\) and a server state \(\textsf {st}_R^{\ell }\) for \(\mathcal {O}^{\ell }\).

    • Initialize upper levels. For every level \(1\le i <\ell \):

      • Initialize \({\mathcal {DB}}^i\) to consist of \(l_i\) dummy memory blocks.

      • Generate an RO-ORAM scheme \(\mathcal {O}^i\) for \(\mathcal {DB}^i\) by running \(\left( {\mathsf {ck}}_R^i,\textsf {st}_R^i\right) \leftarrow \textsf {Setup}_R\left( 1^{\lambda },\mathcal {DB}^i\right) \) to obtain a client key \({\mathsf {ck}}_R^i\) and a server state \(\textsf {st}_R^i\) for \(\mathcal {O}^i\).

      • Generate a map data structure \(\mathcal {M}^i\) mapping each block \(\left( j,b_j\right) \) in \(\mathcal {DB}^i\) to its index in \(\mathcal {DB}^i\). (That is, if \(\left( j,b_j\right) \) is the t’th block of \(\mathcal {DB}^i\) then the entry \(\left( t,j\right) \) is added to \(\mathcal {M}^i\).)

      • Generate a metadata ORAM scheme \(\mathcal {MO}^i\) for \(\mathcal {M}^i\), by running \(\left( {\mathsf {ck}}_m^i,\textsf {st}_m^i\right) \leftarrow \textsf {Setup}_m\left( 1^{\lambda },\mathcal {M}^i\right) \) to obtain the client key and server state for \(\mathcal {MO}^i\).

    • Output. The long-term client key \({\mathsf {ck}}= \left( {\mathsf {ck}}_R^{\ell },\left\{ {\mathsf {ck}}_R^i,{\mathsf {ck}}_m^i\right\} _{i\in \left[ \ell -1\right] }\right) \) consists of the client keys for the RO-ORAMs \(\mathcal {O}^i\) and the metadata ORAMs \(\mathcal {MO}^i\) of all levels. The server state \(\textsf {st}_S = \left( \textsf {count},\textsf {st}_R^{\ell },\mathcal {DB}^{\ell },\left\{ \textsf {st}_R^i,\textsf {st}_m^i,\mathcal {DB}^i\right\} _{i\in \left[ \ell -1\right] }\right) \) contains the counter \(\textsf {count}\) of the number of write operations performed, the server states in the RO-ORAMs \(\mathcal {O}^i\) and the metadata ORAMs \(\mathcal {MO}^i\) of all levels, as well as the memory contents \(\mathcal {DB}^i\) of all levels.

Fig. 1.

The \(\textsf {ReShuffle}^{\ell }\) protocol used in Construction 2

Fig. 2.

\(\textsf {ReShuffle}^{\ell }\) execution on a toy-example ORAM with logical memory size \(n=5\) and \(\ell =4\) levels. The red circle indicates the block which is currently updated. Arrows denote the output of the metadata and RO ORAMs, where dashed arrows denote dummy accesses. Block 1 is updated first (top left): \(\mathcal {MO}^3\) is accessed and returns \(t=2\) indicating that block 1 appears as the second block of \(\mathcal {DB}^3\). The block \(\left( 1,v_1'\right) \) is then read from \(\mathcal {O}^3\), and updated in \(\mathcal {DB}^4\). Block 2 is updated next (top right): \(\mathcal {MO}^3\) is accessed and returns \(t=3\) indicating that block 2 appears as the third block of \(\mathcal {DB}^3\). The block \(\left( 2,v_2''\right) \) is then read from \(\mathcal {O}^3\), and updated in \(\mathcal {DB}^4\). Block 3 is updated next (center left): \(\mathcal {MO}^3\) is accessed and returns \(t=1\) indicating that block 3 appears as the first block of \(\mathcal {DB}^3\). The block \(\left( 3,v_3'\right) \) is then read from \(\mathcal {O}^3\), and updated in \(\mathcal {DB}^4\). Block 4 is updated next (center right): \(\mathcal {MO}^3\) is accessed and returns \(t=\perp \), indicating that block 4 does not appear in \(\mathcal {DB}^3\). Therefore, a dummy read is performed on \(\mathcal {O}^3\), and a dummy write is performed on \(\mathcal {DB}^4\). Finally, block 5 is updated (bottom left): \(\mathcal {MO}^3\) is accessed and returns \(t=\perp \), indicating that block 5 does not appear in \(\mathcal {DB}^3\). Therefore, a dummy read is performed on \(\mathcal {O}^3\), and a dummy write is performed on \(\mathcal {DB}^4\). The values of \(\mathcal {DB}^3,\mathcal {DB}^4\) at the end of the \(\textsf {ReShuffle}^{\ell }\) execution are depicted at the bottom right (these values are used to generate new RO-ORAMs \(\mathcal {O}^3,\mathcal {O}^4\), and update the metadata ORAMs \(\mathcal {MO}^3,\mathcal {MO}^4\)).

Fig. 3.

The \(\textsf {ReShuffle}\) protocol used in Construction 2

Fig. 4.

\(\textsf {ReShuffle}\) execution for \(i=1\) on the ORAM from Fig. 2. The red circle indicates the block which is currently updated. Arrows denote the output of the metadata and RO ORAMs, where dashed arrows denote dummy accesses. The blocks of \(\mathcal {DB}^2\) are updated first. The first block of \(\mathcal {DB}^2\) is updated first (top left): \(\mathcal {MO}^1\) is accessed and returns \(t=\perp \) indicating that this block does not appear in \(\mathcal {DB}^1\). Therefore, a dummy read is performed on \(\mathcal {O}^1\), and dummy writes are performed on \(\mathcal {MO}^1,\mathcal {DB}^2\). The second block of \(\mathcal {DB}^2\) is updated next (top right): \(\mathcal {MO}^1\) is accessed and returns \(t=1\) indicating that this block appears as the first block of \(\mathcal {DB}^1\). The block \(\left( 4,v_4'\right) \) is then read from \(\mathcal {O}^1\), and updated in \(\mathcal {DB}^2\). Then, the block is deleted from \(\mathcal {DB}^1\) by updating \(\mathcal {MO}^1\) (replacing the entry \(\left( 1,4\right) \) with \(\left( \perp ,4\right) \)). Next, the blocks of \(\mathcal {DB}^1\) are copied into \(\mathcal {DB}^2\). The first block of \(\mathcal {DB}^1\) is copied first. \(\mathcal {MO}^1\) is accessed and returns \(t=\perp \), indicating that this block was already copied into \(\mathcal {DB}^2\) (and removed from \(\mathcal {DB}^1\)). Therefore, a dummy block is written to \(\mathcal {DB}^2\), and dummy writes are performed on \(\mathcal {MO}^1,\mathcal {MO}^2\). Finally, the second block of \(\mathcal {DB}^1\) is copied. \(\mathcal {MO}^1\) is accessed and returns \(t=2\), indicating that the block has not been removed from \(\mathcal {DB}^1\). The block is then written into \(\mathcal {DB}^2\), \(\mathcal {MO}^2\) is updated to reflect that block 1 appears as the fourth block of \(\mathcal {DB}^2\), and the block is deleted from \(\mathcal {DB}^1\) by updating \(\mathcal {MO}^1\) accordingly.
The values of \(\mathcal {DB}^1,\mathcal {DB}^2\) at the end of the \(\textsf {ReShuffle}\) execution are depicted at the bottom (these values are used to generate new RO-ORAMs \(\mathcal {O}^1,\mathcal {O}^2\)).

The \({{\mathbf {\mathsf{{Read}}}}}\) protocol. To read the logical memory block at location \(\textsf {addr}\in [{n}]\) from the server S, the client C with key \(\left( {\mathsf {ck}}_R^{\ell },\left\{ {\mathsf {ck}}_R^i,{\mathsf {ck}}_m^i\right\} _{i\in \left[ \ell -1\right] }\right) \) operates as follows, where in all executions of the \(\textsf {Read}_R\) protocol on \(\mathcal {O}^i\) (respectively, all executions of the \(\textsf {Read}_m\) or \(\textsf {Write}_m\) protocols on \(\mathcal {MO}^i\)) S plays the role of the server with state \(\textsf {st}_R^i\) (respectively, \(\textsf {st}_m^i\)) and C plays the role of the client with key \({\mathsf {ck}}_R^i\) (respectively, \({\mathsf {ck}}_m^i\)).

  • \(\underline{{\text {Determine block location in level}}\, i.}\)  For every level \(1\le i\le \ell -1\), run the \(\textsf {Read}_m\) protocol on \(\mathcal {MO}^i\) to read the index l in which the block appears in \(\mathcal {DB}^i\). (If block \(\textsf {addr}\) does not appear in level i, then \(l=\perp \).)

  • \(\underline{{\text {Read block from level}}\, i.}\)  For every level \(1\le i\le \ell -1\), if \(l=\perp \), set \(l=1\). Run the \(\textsf {Read}_R\) protocol on \(\mathcal {O}^i\) to read the l’th block from \(\mathcal {DB}^i\).

  • \(\underline{{\text {Read block from level}}\,\ell .}\)  Run the \(\textsf {Read}_R\) protocol on \(\mathcal {O}^{\ell }\) to read the \(\textsf {addr}\)’th block from \(\mathcal {DB}^{\ell }\).

  • Output. Let \(i^*\) be the smallest level such that block \(\textsf {addr}\) appears in \(\mathcal {DB}^{i^*}\), and let \(\left( \textsf {addr},v\right) \) denote the block returned by the execution of the \(\textsf {Read}_R\) protocol on \(\mathcal {O}^{i^*}\). Output v to C. (All other memory blocks returned by the \(\textsf {Read}_R\) protocol executions are ignored.)
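A minimal plaintext sketch of the \(\textsf {Read}\) protocol's control flow; levels are modeled as plain dictionaries and arrays, which is a simplification: in the real protocol every lookup is an oblivious protocol execution, and absent blocks still trigger dummy reads so the access pattern is unchanged.

```python
def oram_read(levels, addr):
    # Each upper level: {"MO": {addr: index}, "O": [(addr, val), ...]};
    # the last entry is the lowest level, storing all n blocks in order.
    result = None
    for lvl in levels[:-1]:
        t = lvl["MO"].get(addr)        # Read_m on MO^i: index of the block, or None
        idx = 0 if t is None else t    # dummy index for a dummy read (paper: l = 1)
        blk = lvl["O"][idx]            # Read_R on O^i is performed either way
        if t is not None and result is None:
            result = blk[1]            # keep the copy from the topmost hit i*
    bottom = levels[-1]["O"][addr]     # level l: the addr'th block is block addr
    return result if result is not None else bottom[1]
```

Note that every level is accessed on every read, regardless of where the block is found; this is what keeps the access pattern independent of the queried address's history.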

The \({{\mathbf {\mathsf{{Write}}}}}\) protocol. To write value \(\mathsf {val}\) to block \(\textsf {addr}\in [{n}]\) in the logical memory, the client C with key \(\left( {\mathsf {ck}}_R^{\ell },\left\{ {\mathsf {ck}}_R^i,{\mathsf {ck}}_m^i\right\} _{i\in \left[ \ell -1\right] }\right) \) operates as follows.

  • Generate a “dummy” level 0 which contains a single memory block \(\left( \textsf {addr},\mathsf {val}\right) \), and send it to the server.

  • Update the server state and client key as follows:

    • \(\textsf {count}:= \textsf {count}+1\).

    • If \(l_{\ell -1}\) divides \(\textsf {count}\), then reshuffle level \(\ell -1\) into level \(\ell \) using the \(\textsf {ReShuffle}^{\ell }\) procedure of Fig. 1, namely execute \(\textsf {ReShuffle}^{\ell }\left( {\mathsf {ck}}_R^{\ell -1},{\mathsf {ck}}_R^{\ell },{\mathsf {ck}}_m^{\ell -1},\textsf {st}_R^{\ell -1},\textsf {st}_R^{\ell }, \textsf {st}_m^{\ell -1}\right) \).

    • For every i from \(\ell -2\) down to 0 for which \(l_i\) divides \(\textsf {count}\), reshuffle level i into level \(i+1\) using the \(\textsf {ReShuffle}\) procedure of Fig. 3, namely execute \(\textsf {ReShuffle}\left( i,{\mathsf {ck}}_R^i,{\mathsf {ck}}_R^{i+1},{\mathsf {ck}}_m^i,{\mathsf {ck}}_m^{i+1},\textsf {st}_R^{i},\textsf {st}_R^{i+1},\textsf {st}_m^{i},\textsf {st}_m^{i+1}\right) \).
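The reshuffle schedule triggered by \(\textsf {Write}\) can be sketched as follows, where `sizes[i]` plays the role of \(l_i\) (an illustrative helper, not the paper's interface); note the bottom-up order, matching the protocol above:

```python
def levels_to_reshuffle(count, sizes):
    # sizes[i] = l_i, the size of level i (i = 0..l-1).  Level i is
    # reshuffled into level i+1 whenever l_i divides the write counter,
    # processing levels bottom-up: level l-1 first, then l-2 down to 0.
    return [i for i in range(len(sizes) - 1, -1, -1) if count % sizes[i] == 0]
```

Since \(l_i\) grows with i, lower levels are reshuffled exponentially less often, which is what keeps the amortized \(\textsf {write}\) cost low.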

Remark on De-amortization. We note that using a technique of Ostrovsky and Shoup [OS97], the server complexity in Construction 2 can be de-amortized, by slightly modifying the \(\textsf {Write}\) protocol to allow the reshuffling process to be spread out over multiple accesses to the ORAM. The reason reshuffle operations can be “spread out” is that reshuffling is performed in a “bottom-up” fashion, namely when it is time to reshuffle level i into level \(i+1\), that reshuffling is executed before level \(i-1\) is reshuffled into level i. Thus, the memory blocks that are involved in the reshuffle of level i into level \(i+1\) have been known for the last \(l_{i-1}\) time units, ever since level i was last updated due to a reshuffle of level \(i-1\) into it. Therefore, the operations needed to perform the reshuffle of level i into level \(i+1\) can be spread out over \(l_{i-1}\) operations.
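The de-amortization idea amounts to splitting the total server work of a reshuffle into near-equal chunks, one per \(\textsf {write}\) operation in the available window of \(l_{i-1}\) writes. A sketch (the function name and even-chunking policy are illustrative):

```python
import math

def reshuffle_schedule(total_work: int, window: int):
    # Spread total_work server operations evenly over `window` write
    # operations, where window = l_{i-1}: the number of writes between
    # the moment level i's contents are fixed and its next reshuffle.
    chunk = math.ceil(total_work / window)
    per_write, done = [], 0
    for _ in range(window):
        step = min(chunk, total_work - done)
        per_write.append(step)
        done += step
    return per_write
```

Each write then performs its own accesses plus one chunk of pending reshuffle work, turning the amortized bound into a worst-case one.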

A Note on Statistically-Secure ORAM with Writes. Our ORAM with writes constructions (Theorems 7–9) are computationally-secure due to the use of a computationally-secure RO-ORAM with oblivious setup. However, given a statistically-secure RO-ORAM with oblivious setup the resultant ORAM with writes would also be statistically secure. As noted in Sect. 3.1, such a scheme can be obtained assuming an LDC with a small encoding circuit, or with an oblivious encoding procedure. Thus, given an LDC with one of these additional properties we can get a statistically-secure ORAM with writes (with the parameters stated in Theorems 7–9).