Alibi: A Flaw in Cuckoo-Hashing Based Hierarchical ORAM Schemes and a Solution

Hemenway Falk, Brett; Noble, Daniel; Ostrovsky, Rafail

doi:10.1007/978-3-030-77883-5_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12698)

Included in the following conference series:

Annual International Conference on the Theory and Applications of Cryptographic Techniques

Abstract

There once was a table of hashes

That held extra items in stashes

It all seemed like bliss

But things went amiss

When the stashes were stored in the caches

The first Oblivious RAM protocols introduced the “hierarchical solution,” (STOC ’90) where the server stores a series of hash tables of geometrically increasing capacities. Each ORAM query would read a small number of locations from each level of the hierarchy, and each level of the hierarchy would be reshuffled and rebuilt at geometrically increasing intervals to ensure that no single query was ever repeated twice at the same level. This yielded an ORAM protocol with polylogarithmic overhead.

Future works extended and improved the hierarchical solution, replacing traditional hashing with cuckoo hashing (ICALP ’11) and cuckoo hashing with a combined stash (Goodrich et al. SODA ’12). In this work, we identify a subtle flaw in the protocol of Goodrich et al. (SODA ’12) that uses cuckoo hashing with a stash in the hierarchical ORAM solution.

We give a concrete distinguishing attack against this type of hierarchical ORAM that uses cuckoo hashing with a combined stash. This security flaw has propagated to at least 5 subsequent hierarchical ORAM protocols, including the recent optimal ORAM scheme, OptORAMa (Eurocrypt ’20).

In addition to our attack, we identify a simple fix that does not increase the asymptotic complexity.

We note, however, that our attack only affects more recent hierarchical ORAMs, but does not affect the early protocols that predate the use of cuckoo hashing, or other types of ORAM solutions (e.g. Path ORAM or Circuit ORAM).
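For readers unfamiliar with the data structure at the centre of the abstract, the following is a minimal, non-oblivious sketch of a cuckoo hash table with a stash, in the style of Kirsch, Mitzenmacher and Wieder. It is an illustration only and not the construction of any of the ORAM protocols discussed: the class name `CuckooWithStash`, the helper `_h`, the parameter `max_kicks`, and the use of SHA-256 as a stand-in for a random function are all assumptions made here. It uses a stash private to one table, assumes distinct keys, and models neither oblivious builds, the combined stash, nor the attack.

```python
import hashlib


def _h(seed: int, key: int, n_slots: int) -> int:
    """Hypothetical keyed hash into [0, n_slots); SHA-256 stands in for a random function."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_slots


class CuckooWithStash:
    """Two-table cuckoo hashing; items that cannot be placed go to a stash."""

    def __init__(self, n_slots: int, max_kicks: int = 50, seeds=(1, 2)):
        self.n_slots = n_slots
        self.max_kicks = max_kicks  # eviction budget before giving up (assumed value)
        self.seeds = seeds
        self.tables = [[None] * n_slots, [None] * n_slots]
        self.stash = []

    def insert(self, key: int, value) -> None:
        """Insert a (key, value) pair; keys are assumed to be distinct."""
        item = (key, value)
        side = 0
        for _ in range(self.max_kicks):
            pos = _h(self.seeds[side], item[0], self.n_slots)
            # Place the item and pick up whatever occupied the slot.
            item, self.tables[side][pos] = self.tables[side][pos], item
            if item is None:
                return  # the slot was free, nothing was evicted
            side = 1 - side  # the evicted item tries its other table
        self.stash.append(item)  # eviction budget exhausted: stash the leftover

    def lookup(self, key: int):
        """Check the two candidate slots, then the stash."""
        for side in (0, 1):
            slot = self.tables[side][_h(self.seeds[side], key, self.n_slots)]
            if slot is not None and slot[0] == key:
                return slot[1]
        for k, v in self.stash:
            if k == key:
                return v
        return None
```

Every stored item sits either in one of its two hashed slots or in the stash, which is why a lookup only needs to inspect those locations.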

Notes

1.
Rebuilds require constructing oblivious hash tables, which is relatively costly, so the amortized cost of lookups is usually dominated by the rebuild cost. Much of the progress in the literature has been towards reducing this cost, but to simplify the narrative, we focus here only on the costs of lookups without rebuilds.
2.
Even though a logarithmic-sized stash provides a negligible failure probability, for the smaller levels, a failure probability that is negligible in the size of the level may be non-negligible in the overall size of the ORAM. To avoid this problem, [GM11] suggested using traditional hash tables (rather than cuckoo hashing) for the smaller levels of the hierarchy, i.e., until the level size reached $\mathcal {O}\left( \log ^7(n) \right) $.
3.
With some additional work, an ORAM scheme can be made into an oblivious implementation of a dictionary, i.e., one whose keys are chosen from a space other than [N], but we avoid this version for simplicity.
4.
In practice, implementations must use hash functions that are not truly random, but seem sufficiently random to a computationally bounded adversary.
5.
In the client-server setting, expense is measured as the communication between the client and the server. In the MPC setting, expense is measured as the communication between the parties in the computation.
6.
This greedy matching assignment may not give an optimal matching for G, but it will provide an upper bound for $s_1$ in terms of $s_0$.
7.
Some schemes use a mixture of hash table types at different levels. We do not require that all levels use a Cuckoo Hash Table, only that there is at least one such level of size $\le \frac{N}{2}$ that has its stash re-inserted into the ORAM data structure.
8.
This is not quite true. We would like to construct $L_i$ such that it contains indices $1, \ldots , m$ (although some of these may be stashed). However, due to reinsertions of the stash, this will actually need to occur in a level with capacity roughly 2m. If additional accesses are needed to trigger the rebuild, then the same element, e.g., (1, 0), can be looked up multiple times. The exact details of what sequence of accesses is needed in order to cause elements $1, \ldots , m$ to be inserted into a particular level also vary depending on how exactly the ORAM is constructed. More generally, the sequence $(1, 0), \ldots , (m, 0)$ at the beginning of both U and $U'$ should be replaced with whatever sequence in the given ORAM is needed in order to instantiate a level to contain exactly the indices $1, \ldots , m$.
9.
It is possible that when the ORAM is initialized, elements from $L_\ell $ are stashed and stored in the cache. These elements would inadvertently also be stored in $L_i$. The effect of this on the Cuckoo Hash Table is small.
10.
OptORAMa searches in the Combined Stash after searching in the bins, so the access pattern in the bins will be the same for items that are later found in the Combined Stash. However, in PanORAMa, the Combined Stash is accessed before the bins are accessed, and a random bin is chosen in the case that the data is found in the Combined Stash. Therefore, the access patterns in the individual bins are also vulnerable to a distinguishing attack based on the fact that stashed elements will not be searched for. This can be solved simply by searching the bins before searching the Combined Stash.
11.
The proof would work out the same if T was the Combined Stash Hash Table.
12.
This protocol uses a slightly different definition of Oblivious Hash Tables. Rather than returning a single array, $\mathsf {Build}$ returns a tuple $(T_i, S_i)$, where $T_i$ is the main table and $S_i$ is the stash. $\mathsf {Lookup}$ only accesses the non-stash locations.
13.
Chan et al. also presented a concrete instantiation of Goodrich and Mitzenmacher’s ORAM protocol in an appendix of the full version of their paper. The protocol they present uses a Cuckoo Hash Table at each level and a shared stash, so it is vulnerable to the attack described in this paper. However, they recommend, somewhat clairvoyantly, that since Cuckoo hashing is complex and hard to prove correct, their two-tier hash-table protocol should be used rather than the Cuckoo-hashing protocol.
14.
In response to our preprint, Asharov et al. have updated the OptORAMa paper to include a fix.

References

Aumüller, M., Dietzfelbinger, M., Woelfel, P.: Explicit and efficient hash families suffice for cuckoo hashing with a stash. Algorithmica 70(3), 428–456 (2014)
Asharov, G., Komargodski, I., Lin, W.-K., Nayak, K., Peserico, E., Shi, E.: OptORAMa: optimal oblivious RAM. In: Canteaut, A., Ishai, Y. (eds.) EUROCRYPT 2020. LNCS, vol. 12106, pp. 403–432. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45724-2_14
Brasser, F., Müller, U., Dmitrienko, A., Kostiainen, K., Capkun, S., Sadeghi, A.-R.: SGX cache attacks are practical. In: WOOT, Software Grand Exposure (2017)
Chan, T.-H.H., Guo, Y., Lin, W.-K., Shi, E.: Oblivious hashing revisited, and applications to asymptotically efficient ORAM and OPRAM. In: Takagi, T., Peyrin, T. (eds.) ASIACRYPT 2017. LNCS, vol. 10624, pp. 660–690. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70694-8_23
Doerner, J., Shelat, A.: Scaling ORAM for secure computation. In: CCS, pp. 523–535 (2017)
Götzfried, J., Eckert, M., Schinzel, S., Müller, T.: Cache attacks on Intel SGX. In: Proceedings of the 10th European Workshop on Systems Security, pp. 1–6 (2017)
Goodrich, M.T., Mitzenmacher, M.: Privacy-preserving access of outsourced data via oblivious RAM simulation. In: Aceto, L., Henzinger, M., Sgall, J. (eds.) ICALP 2011. LNCS, vol. 6756, pp. 576–587. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22012-8_46
Goodrich, M.T., Mitzenmacher, M., Ohrimenko, O., Tamassia, R.: Privacy-preserving group data access via stateless oblivious RAM simulation. In: SODA, pp. 157–167. SIAM (2012)
Goldreich, O., Ostrovsky, R.: Software protection and simulation on oblivious RAMs. JACM 43(3), 431–473 (1996)
John, T.M., Haider, S.K., Omar, H., van Dijk, M.: Connecting the dots: privacy leakage via write-access patterns to the main memory. IEEE Trans. Dependable Secure Comput. (2017)
Kushilevitz, E., Lu, S., Ostrovsky, R.: On the (in)security of hash-based oblivious RAM and a new balancing scheme. In: SODA, pp. 143–156. SIAM (2012)
Kushilevitz, E., Mour, T.: Sub-logarithmic distributed oblivious RAM with small block size. In: Lin, D., Sako, K. (eds.) PKC 2019. LNCS, vol. 11442, pp. 3–33. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17253-4_1
Kirsch, A., Mitzenmacher, M., Wieder, U.: More robust hashing: Cuckoo hashing with a stash. SIAM J. Comput. 39(4), 1543–1561 (2009)
Liu, C., Huang, Y., Shi, E., Katz, J., Hicks, M.: Automating efficient RAM-model secure computation. In: S&P, pp. 623–638. IEEE (2014)
Lu, S., Ostrovsky, R.: Distributed oblivious RAM for secure two-party computation. In: Sahai, A. (ed.) TCC, pp. 377–396. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36594-2_22
Moghimi, A., Irazoqui, G., Eisenbarth, T.: CacheZoom: how SGX amplifies the power of cache attacks. In: Fischer, W., Homma, N. (eds.) CHES 2017. LNCS, vol. 10529, pp. 69–90. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66787-4_4
Mitzenmacher, M.: Some open questions related to cuckoo hashing. In: ESA, pp. 1–10 (2009)
Mitchell, J.C., Zimmerman, J.: Data-oblivious data structures. In: STACS. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2014)
Ostrovsky, R., Shoup, V.: Private information storage. In: STOC, vol. 97, pp. 294–303. Citeseer (1997)
Ostrovsky, R.: Efficient computation on oblivious RAMs. In: STOC, pp. 514–523 (1990)
Ostrovsky, R.: Software protection and simulation on oblivious RAMs. Ph.D. thesis, Massachusetts Institute of Technology (1992)
Patel, S., Persiano, G., Raykova, M., Yeo, K.: PanORAMa: oblivious RAM with logarithmic overhead. In: FOCS, pp. 871–882. IEEE (2018)
Pagh, R., Rodler, F.F.: Cuckoo hashing. J. Algorithms 51, 122–144 (2004)
Pinkas, B., Reinman, T.: Oblivious RAM revisited. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 502–519. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14623-7_27
Shi, E., Chan, T.-H.H., Stefanov, E., Li, M.: Oblivious RAM with O((logN)³) worst-case cost. In: Lee, D.H., Wang, X. (eds.) ASIACRYPT 2011. LNCS, vol. 7073, pp. 197–214. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25385-0_11
Sasy, S., Gorbunov, S., Fletcher, C.W.: Zerotrace: oblivious memory primitives from Intel SGX. IACR Cryptol. ePrint Arch., 2017:549 (2017)
Stefanov, E., et al.: Path ORAM: an extremely simple oblivious RAM protocol. In: CCS, pp. 299–310 (2013)
Wang, X., Chan, H., Shi, E.: Circuit ORAM: on tightness of the Goldreich-Ostrovsky lower bound. In: CCS, pp. 850–861 (2015)
Wang, X.S., Huang, Y., Chan, T.-H.H., Shelat, A., Shi, E.: SCORAM: oblivious RAM for secure computation. In: CCS, pp. 191–202. ACM (2014)

Acknowledgements

This research was sponsored in part by ONR grant (N00014-15-1-2750) “SynCrypt: Automated Synthesis of Cryptographic Constructions”. This research was supported in part by DARPA under Cooperative Agreement No: HR0011-20-2-0025, NSF-BSF Grant 1619348, US-Israel BSF grant 2012366, Google Faculty Award, JP Morgan Faculty Award, IBM Faculty Research Award, Xerox Faculty Research Award, OKAWA Foundation Research Award, B. John Garrick Foundation Award, Teradata Research Award, and Lockheed-Martin Corporation Research Award. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of DARPA, the Department of Defense, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.

Author information

Authors and Affiliations

University of Pennsylvania, Philadelphia, USA
Brett Hemenway Falk & Daniel Noble
UCLA, Los Angeles, USA
Rafail Ostrovsky

Corresponding author

Correspondence to Brett Hemenway Falk.

Editor information

Editors and Affiliations

Inria, Paris, France
Anne Canteaut
UCLouvain, Louvain-la-Neuve, Belgium
François-Xavier Standaert

Appendices

Supplementary Material

A Distinguishing Distributions

In this section, we review a basic fact that if two distributions are statistically different, and supported on polynomial-sized sets, then they are polynomial-time distinguishable.

Lemma 7

Let $\left\{ X_n \right\} $, $\left\{ Y_n \right\} $ denote two sequences of distributions supported on polynomial-sized sets, i.e., there is a constant c such that $\max ( |{\text {supp}}(X_n)|, |{\text {supp}}(Y_n)| ) < n^c$. In addition, assume that $X_n$ and $Y_n$ are efficiently samplable.

Then if $\varDelta ( X_n, Y_n )$ is non-negligible, the distributions $\left\{ X_n \right\} $ and $\left\{ Y_n \right\} $ are polynomial-time distinguishable.

Proof

Consider the following maximum likelihood distinguisher, D. Let $W = {\text {supp}}(X_n) \cup {\text {supp}}(Y_n)$, and $m = |W|$. Define

$$\begin{aligned} p_z&{\mathop {=}\limits ^{\text {def}}}\Pr \left[ X_n = z \right] \\ q_z&{\mathop {=}\limits ^{\text {def}}}\Pr \left[ Y_n = z \right] \end{aligned}$$

Fix $t = {\text {poly}}(n)$.

Recall that, with $W = {\text {supp}}(X_n) \cup {\text {supp}}(Y_n)$ as above,

$$\begin{aligned} \sum _{w \in W} \max (p_w,q_w)&= \frac{1}{2} \sum _{w \in W} \left[ \left[ \max (p_w,q_w) + \min (p_w,q_w) \right] + \left[ \max (p_w,q_w) - \min (p_w,q_w) \right] \right] \\&= \frac{1}{2} \left[ 2 + \sum _{w \in W} \left[ \max (p_w,q_w) - \min (p_w,q_w) \right] \right] \\&= \frac{1}{2} \left[ 2 + 2 \varDelta (X_n,Y_n) \right] \\&= 1 + \varDelta (X_n,Y_n) \\ \end{aligned}$$
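As a small sanity check of this identity (an illustrative example chosen here, not taken from the paper), take $W = \{a, b, c\}$ with $(p_a, p_b, p_c) = (\frac{1}{2}, \frac{1}{2}, 0)$ and $(q_a, q_b, q_c) = (\frac{1}{4}, \frac{1}{4}, \frac{1}{2})$, and use $\varDelta (X_n,Y_n) = \frac{1}{2} \sum _{w \in W} |p_w - q_w|$:

$$\begin{aligned} \sum _{w \in W} \max (p_w,q_w)&= \frac{1}{2} + \frac{1}{2} + \frac{1}{2} = \frac{3}{2} \\ \varDelta (X_n,Y_n)&= \frac{1}{2} \left[ \frac{1}{4} + \frac{1}{4} + \frac{1}{2} \right] = \frac{1}{2} \\ 1 + \varDelta (X_n,Y_n)&= \frac{3}{2} \end{aligned}$$

Both sides equal $\frac{3}{2}$, as the identity predicts.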

First, D estimates the frequency of elements in both $X_n$ and $Y_n$ by sampling. D draws tm samples from $X_n$; let $X_{\text {sampled}}$ denote the multiset corresponding to these samples. Similarly, D draws tm samples from $Y_n$; let $Y_{\text {sampled}}$ be the multiset corresponding to these samples.

Then D defines

$$\begin{aligned} \tilde{p}_w&{\mathop {=}\limits ^{\text {def}}}\frac{ \text{ number } \text{ of } \text{ times } \text{ w } \text{ occurred } \text{ in } X_{\text {sampled}} }{tm} \\ \tilde{q}_w&{\mathop {=}\limits ^{\text {def}}}\frac{ \text{ number } \text{ of } \text{ times } \text{ w } \text{ occurred } \text{ in } Y_{\text {sampled}} }{tm} \end{aligned}$$

Finally, given a sample z from a distribution $Z \in \left\{ X_n,Y_n \right\} $, the distinguisher D outputs the guess

$$ A(z) = \left\{ \begin{array}{l} \text{ X } \text{ if } \tilde{p}_z \ge \tilde{q}_z \\ \text{ Y } \text{ if } \tilde{p}_z < \tilde{q}_z\text{. } \end{array} \right. $$

A Hoeffding bound shows that

$$ \Pr \left[ \left| \tilde{p}_z - p_z \right| > \delta \right] < 2e^{- 2mt \delta ^2} $$

and similarly

$$ \Pr \left[ \left| \tilde{q}_z - q_z \right| > \delta \right] < 2e^{- 2mt \delta ^2} $$

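Fix $\delta > 0$, and define

$$\begin{aligned} G&{\mathop {=}\limits ^{\text {def}}}\left\{ z \in W : \left| p_z - q_z \right| \ge 2\delta \right\} \\ B&{\mathop {=}\limits ^{\text {def}}}\left\{ z \in W : \left| p_z - q_z \right| < 2\delta \right\} \end{aligned}$$

Now, notice that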

$$\begin{aligned} \max (p_z,q_z) - 2\delta < \min (p_z,q_z) \qquad \text{ for } \text{ all } z \in B . \end{aligned} \tag{1}$$

The Hoeffding bounds give

$$\begin{aligned} \Pr \left[ \max ( p_z, q_z ) = \max \left( \tilde{p}_z, \tilde{q}_z \right) \right] > 1 - 2e^{-2mt\delta ^2} \qquad \text{ for } z \in G \end{aligned} \tag{2}$$

Let $\epsilon = \max _z \left( |\Pr (X_n = z) - \Pr (Y_n=z)| \right) $. Thus $\epsilon \ge \frac{\varDelta (X_n,Y_n)}{m}$, which is non-negligible.

$$\begin{aligned} \Pr&\left[ \text{ A } \text{ is } \text{ correct } \right] \\&= \frac{1}{2} \left[ \sum _{z \in W} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) = \max \left( p_z, q_z \right) \right] \max \left( p_z,q_z \right) + \sum _{z \in W} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) \ne \max \left( p_z, q_z \right) \right] \min \left( p_z, q_z \right) \right] \\&\ge \frac{1}{2} \left[ \sum _{z \in W} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) = \max \left( p_z, q_z \right) \right] \max \left( p_z,q_z \right) + \sum _{z \in B} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) \ne \max \left( p_z, q_z \right) \right] \min \left( p_z, q_z \right) \right] \\&\ge \frac{1}{2} \left[ \sum _{z \in G} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) = \max \left( p_z, q_z \right) \right] \max \left( p_z,q_z \right) + \sum _{z \in B} \left[ \max \left( p_z, q_z \right) -2 \delta \right] \right] \\&\ge \frac{1}{2} \left[ \left( 1 - 2e^{-2mt\delta ^2} \right) \sum _{z \in W} \max \left( p_z,q_z \right) - 2m\delta \right] \\&= \frac{1}{2} \left[ \left( 1 - 2e^{-2mt\delta ^2} \right) \left[ 1 + \varDelta (X_n,Y_n) \right] - 2m\delta \right] \\&= \left( 1 - 2e^{-2mt\delta ^2} \right) \left[ \frac{1}{2} + \frac{1}{2}\varDelta (X_n,Y_n) \right] - m\delta \end{aligned}$$

For sufficiently large t and sufficiently small $\delta $, this success probability exceeds $\frac{1}{2}$ by a non-negligible amount, so the distinguisher has a non-negligible advantage.
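The distinguisher in the proof can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the paper: the function names `make_distinguisher` and `success_rate`, and the parameter `num_samples` (playing the role of tm), are hypothetical, and the two distributions are passed in as sampling callables.

```python
import random
from collections import Counter


def make_distinguisher(sample_x, sample_y, num_samples):
    """Estimate both distributions empirically and return the guessing rule A.

    num_samples plays the role of tm in the proof (an assumed parameter name).
    """
    x_counts = Counter(sample_x() for _ in range(num_samples))
    y_counts = Counter(sample_y() for _ in range(num_samples))

    def guess(z):
        # Both estimates share the denominator num_samples, so comparing raw
        # counts is the same as comparing the empirical frequencies.
        return "X" if x_counts[z] >= y_counts[z] else "Y"

    return guess


def success_rate(guess, sample_x, sample_y, trials, rng):
    """Empirical success probability when the challenge source is a fair coin flip."""
    correct = 0
    for _ in range(trials):
        if rng.random() < 0.5:
            correct += guess(sample_x()) == "X"
        else:
            correct += guess(sample_y()) == "Y"
    return correct / trials
```

As a usage example, let X be uniform on {0, 1, 2, 3} and let Y assign probabilities (0.4, 0.3, 0.2, 0.1) to the same values. Then $\varDelta = 0.2$, and the exact maximum likelihood rule succeeds with probability $\frac{1}{2} + \frac{1}{2}\varDelta = 0.6$; with enough samples the empirical rule above should approach that value.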

About this paper

Cite this paper

Hemenway Falk, B., Noble, D., Ostrovsky, R. (2021). Alibi: A Flaw in Cuckoo-Hashing Based Hierarchical ORAM Schemes and a Solution. In: Canteaut, A., Standaert, FX. (eds) Advances in Cryptology – EUROCRYPT 2021. EUROCRYPT 2021. Lecture Notes in Computer Science, vol 12698. Springer, Cham. https://doi.org/10.1007/978-3-030-77883-5_12

DOI: https://doi.org/10.1007/978-3-030-77883-5_12
Published: 16 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77882-8
Online ISBN: 978-3-030-77883-5
eBook Packages: Computer Science, Computer Science (R0)
