Alibi: A Flaw in Cuckoo-Hashing Based Hierarchical ORAM Schemes and a Solution

  • Conference paper
Advances in Cryptology – EUROCRYPT 2021 (EUROCRYPT 2021)

Abstract

There once was a table of hashes

That held extra items in stashes

It all seemed like bliss

But things went amiss

When the stashes were stored in the caches

The first Oblivious RAM protocols introduced the “hierarchical solution” (STOC ’90), where the server stores a series of hash tables of geometrically increasing capacities. Each ORAM query reads a small number of locations from each level of the hierarchy, and each level is reshuffled and rebuilt at geometrically increasing intervals to ensure that no query is ever repeated at the same level. This yields an ORAM protocol with polylogarithmic overhead.

Subsequent works extended and improved the hierarchical solution, replacing traditional hashing with cuckoo hashing (ICALP ’11) and then with cuckoo hashing with a combined stash (Goodrich et al. SODA ’12). In this work, we identify a subtle flaw in the protocol of Goodrich et al. (SODA ’12) that uses cuckoo hashing with a stash in the hierarchical ORAM solution.

We give a concrete distinguishing attack against this type of hierarchical ORAM that uses cuckoo hashing with a combined stash. This security flaw has propagated to at least 5 subsequent hierarchical ORAM protocols, including the recent optimal ORAM scheme, OptORAMa (Eurocrypt ’20).

In addition to our attack, we identify a simple fix that does not increase the asymptotic complexity.

We note, however, that our attack only affects more recent hierarchical ORAMs, but does not affect the early protocols that predate the use of cuckoo hashing, or other types of ORAM solutions (e.g. Path ORAM or Circuit ORAM).

Notes

  1. Rebuilds require constructing oblivious hash tables, which is relatively costly, so the amortized cost of lookups is usually dominated by the rebuild cost. Much of the progress in the literature has been towards reducing this cost, but to simplify the narrative, we focus here only on the costs of lookups without rebuilds.

  2. Even though a logarithmic-sized stash provides a negligible failure probability, for the smaller levels, a failure probability that is negligible in the size of the level may be non-negligible in the overall size of the ORAM. To avoid this problem, [GM11] suggested using traditional hash tables (rather than cuckoo hashing) for the smaller levels of the hierarchy, i.e., until the level size reached \(\mathcal {O}\left( \log ^7(n) \right) \).

  3. With some additional work, an ORAM scheme can be made to be an oblivious implementation of a dictionary, i.e., one whose keys are chosen from a space other than [N], but we avoid this version for simplicity.

  4. In practice, implementations must use hash functions that are not truly random, but seem sufficiently random to a computationally bounded adversary.

  5. In the client-server setting, expense is measured by communication between the client and the server. In the MPC setting, expense is measured as the communication between the parties in the computation.

  6. This greedy matching assignment may not give an optimal matching for G, but it will provide an upper bound for \(s_1\) in terms of \(s_0\).

  7. Some schemes use a mixture of hash table types at different levels. We do not require that all levels use a Cuckoo Hash Table, only that there is at least one such level of size \(\le \frac{N}{2}\) that has its stash re-inserted into the ORAM data structure.

  8. This is not quite true. We would like to construct \(L_i\) such that it contains indices \(1, \ldots , m\) (although some of these may be stashed). However, due to reinsertions of the stash, this will actually need to occur in a level with capacity roughly 2m. If additional accesses are needed to trigger the rebuild, then the same element, e.g., (1, 0), can be looked up multiple times. The exact details of what sequence of accesses is needed in order to cause elements \(1, \ldots , m\) to be inserted into a particular level also vary depending on how exactly the ORAM is constructed. More generally, the sequence \((1, 0), \ldots , (m, 0)\) at the beginning of both U and \(U'\) should be replaced with whatever sequence in the given ORAM is needed in order to instantiate a level to contain exactly the indices \(1, \ldots , m\).

  9. It is possible that when the ORAM is initialized, elements from \(L_\ell \) are stashed and stored in the cache. These elements would inadvertently also be stored in \(L_i\). The effect of this on the Cuckoo Hash Table is small.

  10. OptORAMa searches in the Combined Stash after searching in the bins, so the access pattern in the bins will be the same for items that are later found in the Combined Stash. However, in PanORAMa, the Combined Stash is accessed before the bins are accessed, and a random bin is chosen in the case that the data is found in the Combined Stash. Therefore, the access patterns in the individual bins are also vulnerable to a distinguishing attack based on the fact that stashed elements will not be searched for. This can be solved simply by searching the bins before searching the Combined Stash.

  11. The proof would work out the same if T was the Combined Stash Hash Table.

  12. This protocol uses a slightly different definition of Oblivious Hash Tables. Rather than returning a single array, \(\mathsf {Build}\) returns a tuple \((T_i, S_i)\), where \(T_i\) is the main table and \(S_i\) is the stash. \(\mathsf {Lookup}\) only contains the non-stash locations.

  13. Chan et al. also presented a concrete instantiation of Goodrich and Mitzenmacher’s ORAM protocol in an appendix of the full version of their paper. The protocol they present uses a Cuckoo Hash Table at each level and a shared stash, so it is vulnerable to the attack described in this paper. However, they recommend, somewhat clairvoyantly, that since Cuckoo hashing is complex and hard to prove correct, their two-tier hash-table protocol should be used rather than the Cuckoo-hashing protocol.

  14. In response to our preprint, Asharov et al. have updated the OptORAMa paper to include a fix.

References

  1. Aumüller, M., Dietzfelbinger, M., Woelfel, P.: Explicit and efficient hash families suffice for cuckoo hashing with a stash. Algorithmica 70(3), 428–456 (2014)

  2. Asharov, G., Komargodski, I., Lin, W.-K., Nayak, K., Peserico, E., Shi, E.: OptORAMa: optimal oblivious RAM. In: Canteaut, A., Ishai, Y. (eds.) EUROCRYPT 2020. LNCS, vol. 12106, pp. 403–432. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45724-2_14

  3. Brasser, F., Müller, U., Dmitrienko, A., Kostiainen, K., Capkun, S., Sadeghi, A.-R.: SGX cache attacks are practical. In: WOOT, Software Grand Exposure (2017)

  4. Chan, T.-H.H., Guo, Y., Lin, W.-K., Shi, E.: Oblivious hashing revisited, and applications to asymptotically efficient ORAM and OPRAM. In: Takagi, T., Peyrin, T. (eds.) ASIACRYPT 2017. LNCS, vol. 10624, pp. 660–690. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70694-8_23

  5. Doerner, J., Shelat, A.: Scaling ORAM for secure computation. In: CCS, pp. 523–535 (2017)

  6. Götzfried, J., Eckert, M., Schinzel, S., Müller, T.: Cache attacks on Intel SGX. In: Proceedings of the 10th European Workshop on Systems Security, pp. 1–6 (2017)

  7. Goodrich, M.T., Mitzenmacher, M.: Privacy-preserving access of outsourced data via oblivious RAM simulation. In: Aceto, L., Henzinger, M., Sgall, J. (eds.) ICALP 2011. LNCS, vol. 6756, pp. 576–587. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22012-8_46

  8. Goodrich, M.T., Mitzenmacher, M., Ohrimenko, O., Tamassia, R.: Privacy-preserving group data access via stateless oblivious RAM simulation. In: SODA, pp. 157–167. SIAM (2012)

  9. Goldreich, O., Ostrovsky, R.: Software protection and simulation on oblivious RAMs. JACM 43(3), 431–473 (1996)

  10. John, T.M., Haider, S.K., Omar, H., van Dijk, M.: Connecting the dots: privacy leakage via write-access patterns to the main memory. IEEE Trans. Dependable Secure Comput. (2017)

  11. Kushilevitz, E., Lu, S., Ostrovsky, R.: On the (in)security of hash-based oblivious RAM and a new balancing scheme. In: SODA, pp. 143–156. SIAM (2012)

  12. Kushilevitz, E., Mour, T.: Sub-logarithmic distributed oblivious RAM with small block size. In: Lin, D., Sako, K. (eds.) PKC 2019. LNCS, vol. 11442, pp. 3–33. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17253-4_1

  13. Kirsch, A., Mitzenmacher, M., Wieder, U.: More robust hashing: Cuckoo hashing with a stash. SIAM J. Comput. 39(4), 1543–1561 (2009)

  14. Liu, C., Huang, Y., Shi, E., Katz, J., Hicks, M.: Automating efficient RAM-model secure computation. In: S&P, pp. 623–638. IEEE (2014)

  15. Lu, S., Ostrovsky, R.: Distributed oblivious RAM for secure two-party computation. In: Sahai, A. (ed.) TCC, pp. 377–396. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36594-2_22

  16. Moghimi, A., Irazoqui, G., Eisenbarth, T.: CacheZoom: how SGX amplifies the power of cache attacks. In: Fischer, W., Homma, N. (eds.) CHES 2017. LNCS, vol. 10529, pp. 69–90. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66787-4_4

  17. Mitzenmacher, M.: Some open questions related to cuckoo hashing. In: ESA, pp. 1–10 (2009)

  18. Mitchell, J.C., Zimmerman, J.: Data-oblivious data structures. In: STACS. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2014)

  19. Ostrovsky, R., Shoup, V.: Private information storage. In: STOC, vol. 97, pp. 294–303. Citeseer (1997)

  20. Ostrovsky, R.: Efficient computation on oblivious RAMs. In: STOC, pp. 514–523 (1990)

  21. Ostrovsky, R.: Software protection and simulation on oblivious RAMs. Ph.D. thesis, Massachusetts Institute of Technology (1992)

  22. Patel, S., Persiano, G., Raykova, M., Yeo, K.: PanORAMa: oblivious RAM with logarithmic overhead. In: FOCS, pp. 871–882. IEEE (2018)

  23. Pagh, R., Rodler, F.F.: Cuckoo hashing. J. Algorithms 51, 122–144 (2004)

  24. Pinkas, B., Reinman, T.: Oblivious RAM revisited. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 502–519. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14623-7_27

  25. Shi, E., Chan, T.-H.H., Stefanov, E., Li, M.: Oblivious RAM with \(O((\log N)^3)\) worst-case cost. In: Lee, D.H., Wang, X. (eds.) ASIACRYPT 2011. LNCS, vol. 7073, pp. 197–214. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25385-0_11

  26. Sasy, S., Gorbunov, S., Fletcher, C.W.: Zerotrace: oblivious memory primitives from Intel SGX. IACR Cryptol. ePrint Arch., 2017:549 (2017)

  27. Stefanov, E., et al.: Path ORAM: an extremely simple oblivious RAM protocol. In: CCS, pp. 299–310 (2013)

  28. Wang, X., Chan, H., Shi, E.: Circuit ORAM: on tightness of the Goldreich-Ostrovsky lower bound. In: CCS, pp. 850–861 (2015)

  29. Wang, X.S., Huang, Y., Chan, T.-H.H., Shelat, A., Shi, E.: SCORAM: oblivious RAM for secure computation. In: CCS, pp. 191–202. ACM (2014)

Acknowledgements

This research was sponsored in part by ONR grant (N00014-15-1-2750) “SynCrypt: Automated Synthesis of Cryptographic Constructions”. This research was supported in part by DARPA under Cooperative Agreement No: HR0011-20-2-0025, NSF-BSF Grant 1619348, US-Israel BSF grant 2012366, Google Faculty Award, JP Morgan Faculty Award, IBM Faculty Research Award, Xerox Faculty Research Award, OKAWA Foundation Research Award, B. John Garrick Foundation Award, Teradata Research Award, and Lockheed-Martin Corporation Research Award. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of DARPA, the Department of Defense, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.

Author information

Correspondence to Brett Hemenway Falk.

Appendices

A Distinguishing Distributions

In this section, we review a basic fact: if two distributions are statistically far apart and supported on polynomial-sized sets, then they are polynomial-time distinguishable.

Lemma 7

Let \(\left\{ X_n \right\} \), \(\left\{ Y_n \right\} \) denote two sequences of distributions supported on polynomial-sized sets, i.e., there is a constant c, such that \(\max ( |X_n|, |Y_n| ) < n^c\). In addition, assume that \(X_n\) and \(Y_n\) are efficiently samplable.

Then if \(\varDelta ( X_n, Y_n )\) is non-negligible, the distributions \(\left\{ X_n \right\} \) and \(\left\{ Y_n \right\} \) are polynomial-time distinguishable.

Proof

Consider the following maximum likelihood distinguisher, D. Let \(W = {\text {supp}}(X_n) \cup {\text {supp}}(Y_n)\), and \(m = |W|\). Define

$$\begin{aligned} p_z&{\mathop {=}\limits ^{\text {def}}}\Pr \left[ X_n = z \right] \\ q_z&{\mathop {=}\limits ^{\text {def}}}\Pr \left[ Y_n = z \right] \end{aligned}$$

Fix \(t = {\text {poly}}(n)\).

Recall that, since \(W = {\text {supp}}(X_n) \cup {\text {supp}}(Y_n)\),

$$\begin{aligned} \sum _{w \in W} \max (p_w,q_w)&= \frac{1}{2} \sum _{w \in W} \left[ \left[ \max (p_w,q_w) + \min (p_w,q_w) \right] + \left[ \max (p_w,q_w) - \min (p_w,q_w) \right] \right] \\&= \frac{1}{2} \left[ 2 + \sum _{w \in W} \left[ \max (p_w,q_w) - \min (p_w,q_w) \right] \right] \\&= \frac{1}{2} \left[ 2 + 2 \varDelta (X_n,Y_n) \right] \\&= 1 + \varDelta (X_n,Y_n) \\ \end{aligned}$$
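The identity above can be sanity-checked numerically. The following is a minimal sketch using two small, hypothetical distributions (the dictionaries `p` and `q` are illustrative and do not come from the paper); exact rational arithmetic avoids floating-point noise.

```python
from fractions import Fraction

def statistical_distance(p, q):
    """Delta(X, Y) = 1/2 * sum over the union of supports of |p_w - q_w|."""
    support = set(p) | set(q)
    return sum(abs(p.get(w, 0) - q.get(w, 0)) for w in support) / 2

def sum_of_maxima(p, q):
    """Sum over the union of supports of max(p_w, q_w)."""
    support = set(p) | set(q)
    return sum(max(p.get(w, 0), q.get(w, 0)) for w in support)

# Hypothetical toy distributions, chosen only for illustration.
p = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
q = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

delta = statistical_distance(p, q)
# The identity: sum of maxima equals 1 + Delta.
assert sum_of_maxima(p, q) == 1 + delta
```

For this toy pair the statistical distance is 1/2 and the sum of maxima is 3/2, as the identity predicts.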

First, D estimates the frequency of elements in both \(X_n\) and \(Y_n\) by sampling. D draws tm samples from \(X_n\); let \(X_{\text {sampled}}\) denote the multiset corresponding to these samples. Similarly, D draws tm samples from \(Y_n\); let \(Y_{\text {sampled}}\) denote the multiset corresponding to these samples.

Then D defines

$$\begin{aligned} \tilde{p}_w&{\mathop {=}\limits ^{\text {def}}}\frac{ \text{number of times } w \text{ occurred in } X_{\text {sampled}} }{tm} \\ \tilde{q}_w&{\mathop {=}\limits ^{\text {def}}}\frac{ \text{number of times } w \text{ occurred in } Y_{\text {sampled}} }{tm} \end{aligned}$$

Finally, given a sample z from a distribution \(Z \in \left\{ X_n,Y_n \right\} \), the distinguisher D outputs the guess

$$ A(z) = \left\{ \begin{array}{ll} \text{X} & \text{if } \tilde{p}_z \ge \tilde{q}_z \\ \text{Y} & \text{if } \tilde{p}_z < \tilde{q}_z\text{.} \end{array} \right. $$
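The sampling-based guessing rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_distinguisher`, the toy samplers `sample_x` and `sample_y`, and the sample counts are all hypothetical choices, not part of the paper. Here X is uniform on {0, 1} and Y is uniform on {0, 1, 2, 3}, so their statistical distance is 1/2.

```python
import random
from collections import Counter

def build_distinguisher(sample_x, sample_y, num_samples):
    """Estimate frequencies from num_samples draws of each distribution
    and return the guessing rule A(z)."""
    p_counts = Counter(sample_x() for _ in range(num_samples))
    q_counts = Counter(sample_y() for _ in range(num_samples))

    def guess(z):
        # Both estimates share the denominator num_samples, so comparing
        # raw counts is the same as comparing the empirical frequencies.
        return "X" if p_counts[z] >= q_counts[z] else "Y"

    return guess

# Hypothetical toy samplers (assumptions for illustration only).
rng = random.Random(0)
sample_x = lambda: rng.choice([0, 1])
sample_y = lambda: rng.choice([0, 1, 2, 3])

guess = build_distinguisher(sample_x, sample_y, num_samples=2000)

# Empirical success rate when the challenge distribution is chosen uniformly.
trials = 10000
correct = 0
for _ in range(trials):
    if rng.random() < 0.5:
        correct += guess(sample_x()) == "X"
    else:
        correct += guess(sample_y()) == "Y"
success_rate = correct / trials
```

With accurate frequency estimates, this rule guesses X on {0, 1} and Y on {2, 3}, so its success probability on these toy distributions is 3/4; the empirical `success_rate` should land close to that value.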

A Hoeffding bound shows that

$$ \Pr \left[ \left| \tilde{p}_z - p_z \right| > \delta \right] < 2e^{- 2mt \delta ^2} $$

and similarly

$$ \Pr \left[ \left| \tilde{q}_z - q_z \right| > \delta \right] < 2e^{- 2mt \delta ^2} $$

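Fix \(\delta > 0\), and define

$$\begin{aligned} B&{\mathop {=}\limits ^{\text {def}}}\left\{ z \in W : \left| p_z - q_z \right| < 2\delta \right\} \\ G&{\mathop {=}\limits ^{\text {def}}}W \setminus B \end{aligned}$$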

Now, notice that

$$\begin{aligned} \max (p_z,q_z) - 2\delta < \min (p_z,q_z) \qquad \text{for all } z \in B . \end{aligned} \tag{1}$$

The Hoeffding bounds give

$$\begin{aligned} \Pr \left[ \max ( p_z, q_z ) = \max \left( \tilde{p}_z, \tilde{q}_z \right) \right] > 1 - 2e^{-2mt\delta ^2} \qquad \text{for } z \in G \end{aligned} \tag{2}$$

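Let \(\epsilon = \max _z \left( |\Pr (X_n = z) - \Pr (Y_n=z)| \right) \). Since \(\varDelta (X_n,Y_n) = \frac{1}{2} \sum _{z \in W} |p_z - q_z| \le \frac{m \epsilon }{2}\), we have \(\epsilon \ge \frac{\varDelta (X_n,Y_n)}{m}\), which is non-negligible.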

$$\begin{aligned} \Pr&\left[ \text{A is correct} \right] \\&= \frac{1}{2} \left[ \sum _{z \in W} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) = \max \left( p_z, q_z \right) \right] \max \left( p_z,q_z \right) + \sum _{z \in W} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) \ne \max \left( p_z, q_z \right) \right] \min \left( p_z, q_z \right) \right] \\&\ge \frac{1}{2} \left[ \sum _{z \in W} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) = \max \left( p_z, q_z \right) \right] \max \left( p_z,q_z \right) + \sum _{z \in B} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) \ne \max \left( p_z, q_z \right) \right] \min \left( p_z, q_z \right) \right] \\&\ge \frac{1}{2} \left[ \sum _{z \in G} \Pr \left[ \max \left( \tilde{p}_z, \tilde{q}_z \right) = \max \left( p_z, q_z \right) \right] \max \left( p_z,q_z \right) + \sum _{z \in B} \left[ \max \left( p_z, q_z \right) -2 \delta \right] \right] \\&\ge \frac{1}{2} \left[ \left( 1 - 2e^{-2mt\delta ^2} \right) \sum _{z \in W} \max \left( p_z,q_z \right) - 2m\delta \right] \\&= \frac{1}{2} \left[ \left( 1 - 2e^{-2mt\delta ^2} \right) \left[ 1 + \varDelta (X_n,Y_n) \right] - 2m\delta \right] \\&= \left( 1 - 2e^{-2mt\delta ^2} \right) \left[ \frac{1}{2} + \frac{1}{2}\varDelta (X_n,Y_n) \right] - m\delta \end{aligned}$$

This is a non-negligible advantage over \(\frac{1}{2}\) for sufficiently large t and sufficiently small \(\delta \).
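To make "sufficiently large t and sufficiently small \(\delta \)" concrete, here is one possible choice of parameters. It is an illustrative assumption, not necessarily the choice the authors had in mind. Write \(\varDelta = \varDelta (X_n,Y_n)\), and take \(\delta = \frac{\varDelta }{4m}\) and \(t \ge \frac{16 m n}{\varDelta ^2}\). Note that \(\delta \) appears only in the analysis; D itself needs only t and m.

$$\begin{aligned} m\delta&= \frac{\varDelta }{4}, \qquad 2mt\delta ^2 = \frac{t \varDelta ^2}{8m} \ge 2n, \\ \left( 1 - 2e^{-2mt\delta ^2} \right) \left[ \frac{1}{2} + \frac{1}{2}\varDelta \right] - m\delta&\ge \frac{1}{2} + \frac{1}{2}\varDelta - 2e^{-2n} - \frac{\varDelta }{4} = \frac{1}{2} + \frac{\varDelta }{4} - 2e^{-2n} . \end{aligned}$$

The inequality uses \(\frac{1}{2} + \frac{1}{2}\varDelta \le 1\). Since \(m < 2n^c\), whenever \(\varDelta \ge n^{-k}\) for some constant k the required t is at most \(32 n^{c+2k+1}\), which is polynomial in n, and the resulting advantage of roughly \(\frac{\varDelta }{4}\) is non-negligible.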

Copyright information

© 2021 International Association for Cryptologic Research

About this paper

Cite this paper

Hemenway Falk, B., Noble, D., Ostrovsky, R. (2021). Alibi: A Flaw in Cuckoo-Hashing Based Hierarchical ORAM Schemes and a Solution. In: Canteaut, A., Standaert, FX. (eds) Advances in Cryptology – EUROCRYPT 2021. EUROCRYPT 2021. Lecture Notes in Computer Science, vol 12698. Springer, Cham. https://doi.org/10.1007/978-3-030-77883-5_12

  • DOI: https://doi.org/10.1007/978-3-030-77883-5_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77882-8

  • Online ISBN: 978-3-030-77883-5

  • eBook Packages: Computer Science, Computer Science (R0)
