Nearly Optimal Property Preserving Hashing

Holmgren, Justin; Liu, Minghao; Tyner, LaKyah; Wichs, Daniel

doi:10.1007/978-3-031-15982-4_16

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13509)

Included in the following conference series:

Annual International Cryptology Conference

Abstract

Property-preserving hashing (PPH) consists of a family of compressing hash functions h such that, for any two inputs x, y, we can correctly identify whether some property P(x, y) holds given only the digests h(x), h(y). In a basic PPH, correctness should hold with overwhelming probability over the choice of h when x, y are worst-case values chosen a-priori and independently of h. In an adversarially robust PPH (RPPH), correctness must hold even when x, y are chosen adversarially and adaptively depending on h. Here, we study (R)PPH for the property that the Hamming distance between x and y is at most t.

The notion of (R)PPH was introduced by Boyle, LaVigne and Vaikuntanathan (ITCS ’19), and further studied by Fleischhacker, Simkin (Eurocrypt ’21) and Fleischhacker, Larsen, Simkin (Eurocrypt ’22). In this work, we obtain improved constructions that are conceptually simpler, have nearly optimal parameters, and rely on more general assumptions than prior works. Our results are:

We construct information-theoretic non-robust PPH for Hamming distance via syndrome list-decoding of linear error-correcting codes. We provide a lower bound showing that this construction is essentially optimal.
We make the above construction robust with little additional overhead, by relying on homomorphic collision-resistant hash functions, which can be constructed from either the discrete-logarithm or the short-integer-solution assumptions. The resulting RPPH achieves improved compression compared to prior constructions, and is nearly optimal.
We also show an alternate construction of RPPH for Hamming distance under the minimal assumption that standard collision-resistant hash functions exist. The compression is slightly worse than our optimized construction using homomorphic collision-resistance, but essentially matches the prior state of the art constructions from specific algebraic assumptions.
Lastly, we study a new notion of randomized robust PPH (R2P2H) for Hamming distance, which relaxes RPPH by allowing the hashing algorithm itself to be randomized. We give an information-theoretic construction with optimal parameters.
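
As a rough illustration of the first result, the sketch below hashes x to its syndrome Hx under a random binary parity-check matrix H and, given only two digests, searches for a low-weight error pattern whose syndrome equals h(x) - h(y). The names `keygen`, `hash_digest` and `eval_close`, the toy parameters, and the brute-force search are assumptions made for illustration only. The paper relies on syndrome list-decoding of linear error-correcting codes, which this exhaustive search merely stands in for, and the sketch makes no attempt at the paper's parameters or at robustness.

```python
import itertools
import random


def keygen(n, m, seed=0):
    """Sample a random m x n parity-check matrix over F_2 (toy choice).

    Each row is stored as an n-bit integer bitmask.
    """
    rng = random.Random(seed)
    return [rng.getrandbits(n) for _ in range(m)]


def hash_digest(H, x):
    """Return the syndrome H*x over F_2 as an m-bit integer.

    Bit i of the digest is the parity of (row_i AND x).
    """
    s = 0
    for i, row in enumerate(H):
        s |= (bin(row & x).count("1") & 1) << i
    return s


def eval_close(H, n, t, dx, dy):
    """Decide 'Hamming distance <= t' from the two digests alone.

    By linearity, dx XOR dy equals the syndrome of x - y (over F_2,
    subtraction is XOR). We accept if some error pattern of weight <= t
    has that syndrome. The exhaustive search is only feasible for toy sizes.
    """
    s = dx ^ dy
    for w in range(t + 1):
        for positions in itertools.combinations(range(n), w):
            e = 0
            for p in positions:
                e |= 1 << p
            if hash_digest(H, e) == s:
                return True
    return False
```

In this toy version, pairs at distance at most t are always accepted, because x - y itself is a valid low-weight pattern. Pairs at larger distance can be wrongly accepted when x - y differs from a low-weight pattern by a codeword, so correctness for far pairs holds only with some probability over the choice of H.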

Research supported by NSF grants CNS-1750795 and CNS-2055510 and the Alfred P. Sloan Research Fellowship.

Notes

1.
Technically, the \(\textsf{Eval}\) procedure also takes as input the description of the hash function h, but for simplicity we omit this throughout the introduction.
2.
They also considered two additional intermediate variants of PPH, where the adversary does not get the full description of the hash function but gets some partial oracle access before choosing x, y. Our notion of robust PPH is the strongest notion they considered and is also referred to as a “direct access robust” PPH in their work.
3.
Asymptotically, the existence of CRHFs with output length \(\ell (\lambda )=\lambda \) is equivalent to those with output length \(\ell (\lambda )=\lambda ^\varepsilon \) for \(\varepsilon >0\). Moreover, it may be plausible to even conjecture the existence of CRHFs with (e.g.,) output length \(\ell (\lambda ) = \log \lambda \log \log \lambda \). However, these choices will have vastly different exact security. All the constructions/reductions referred to in this work preserve exact security. Therefore, we find it more informative to phrase all results in terms of the exact output length \(\ell (\lambda )\) of the underlying primitive, and the construction will inherit the exact security of that primitive with the given output length.
4.
Since we’re working over \({\mathbb F}_2\), addition and subtraction are equivalent, but we use subtraction to make it easier to compare to later constructions that work in larger fields.
5.
On the other hand, it allows us to use codes over \({\mathbb F}_3\) which may have slightly improved rate compared to ones over \({\mathbb F}_2\).
6.
A heuristic construction would be to define the hash function \(h_{CR}\) whose description consists of an obfuscated program that has a hard-coded random matrix \(A \leftarrow {\mathbb Z}_2^{\lambda \times n}\) and a key k for a pseudorandom permutation \(\pi _k~:~\{0,1\}^{\lambda } \rightarrow \{0,1\}^\lambda \). On input (“hash”, x) the program would output \(\pi _k(Ax)\), which we would also define as the output of the hash function \(h_{CR}(x)\). On input (“homomorphism”, \(y_1\), \(y_2\)) the program would output \(\pi _k( \pi ^{-1}_k(y_1) - \pi ^{-1}_k(y_2))\), which would allow us to implement the homomorphic operation on the hash outputs.
7.
There are multiple possible parity check matrices for any code, but the specific choice will be unimportant for us.
8.
The asymptotic bound of \(\textrm{poly}(n)\) in fact assumes that we have a family of codes for an infinite and dense set of n.
9.
In fact, a random linear code is known to have the stated list decodability with high probability.
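
The interface described in note 6 can be sketched as follows. This is a toy model only: there is no obfuscation, the pseudorandom permutation is replaced by a lookup table over a tiny domain, and nothing here is collision-resistant or secure. The class name `ToyHomomorphicHash`, its method names, and the parameters `n`, `lam` and `seed` are hypothetical. The sketch only shows why the "homomorphism" operation applied to two hash outputs yields the hash of the difference of the inputs.

```python
import random


class ToyHomomorphicHash:
    """Toy stand-in for h_CR(x) = pi_k(A x) from note 6 (not secure)."""

    def __init__(self, n, lam, seed=0):
        rng = random.Random(seed)
        # Random matrix A in Z_2^{lam x n}; each row is an n-bit bitmask.
        self.rows = [rng.getrandbits(n) for _ in range(lam)]
        # Assumption: a shuffled table plays the role of the keyed
        # permutation pi_k on {0,1}^lam. Only workable for small lam.
        self.pi = list(range(1 << lam))
        rng.shuffle(self.pi)
        self.pi_inv = [0] * len(self.pi)
        for i, v in enumerate(self.pi):
            self.pi_inv[v] = i

    def _linear(self, x):
        # Compute A x over Z_2 as a lam-bit integer.
        s = 0
        for i, row in enumerate(self.rows):
            s |= (bin(row & x).count("1") & 1) << i
        return s

    def hash(self, x):
        # The ("hash", x) query: output pi_k(A x).
        return self.pi[self._linear(x)]

    def homomorphism(self, y1, y2):
        # The ("homomorphism", y1, y2) query:
        # pi_k(pi_k^{-1}(y1) - pi_k^{-1}(y2)), where subtraction
        # over Z_2 is bitwise XOR.
        return self.pi[self.pi_inv[y1] ^ self.pi_inv[y2]]
```

For any inputs `x1` and `x2`, `homomorphism(hash(x1), hash(x2))` equals `hash(x1 ^ x2)`, since undoing the permutation exposes the linear images A x1 and A x2, whose difference is A(x1 - x2).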

References

Ajtai, M.: Generating hard instances of lattice problems (extended abstract). In: 28th ACM STOC, pp. 99–108. ACM Press (1996)
Apple CSAM detection. https://www.apple.com/child-safety/pdf/CSAM_Detection_Technical_Summary.pdf. Accessed 13 Feb 2022
Babai, L., Kimmel, P.G.: Randomized simultaneous messages: solution of a problem of Yao in communication complexity. In: Twelfth Annual IEEE Conference on Proceedings of Computational Complexity, pp. 239–246 (1997)
Boyle, E., LaVigne, R., Vaikuntanathan, V.: Adversarially robust property-preserving hash functions. In: Blum, A., (ed.) ITCS 2019, vol. 124, pp. 16:1–16:20. LIPIcs (2019)
Blokh, E.L., Zyablov, V.: Linear concatenated codes. Nauka (1982)
Cohen, S.P., Naor, M.: Low communication complexity protocols, collision resistant hash functions and secret key-agreement protocols. Cryptol. ePrint Arch. Paper 2022/312, 2022. https://eprint.iacr.org/2022/312
Apple’s CSAM detection tech is under fire - again. https://techcrunch.com/2021/08/18/apples-csam-detection-tech-is-under-fire-again/. Accessed 13 Feb 2022
Dodis, Y., Ostrovsky, R., Reyzin, L., Smith, A.D.: Fuzzy extractors: how to generate strong keys from biometrics and other noisy data. SIAM J. Comput. 38(1), 97–139 (2008)
Fleischhacker, N., Larsen, K.G., Simkin, M.: Property-preserving hash functions from standard assumptions. EUROCRYPT (2022)
Fleischhacker, N., Simkin, M.: Robust property-preserving hash functions for Hamming distance and more. In: Canteaut, A., Standaert, F.-X. (eds.) EUROCRYPT 2021. LNCS, vol. 12698, pp. 311–337. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-77883-5_11
Guruswami, V., Håstad, J., Kopparty, S.: On the list-decodability of random linear codes. In: Schulman, L.J. (ed.) Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, Massachusetts, USA, 5–8 June 2010, pp. 409–416. ACM (2010)
Guruswami, V., Rudra, A.: Better binary list decodable codes via multilevel concatenation. IEEE Trans. Inf. Theory 55(1), 19–26 (2009)
Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: Vitter, J.S., (ed.) Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, Dallas, Texas, USA, May 23–26, 1998, pp. 604–613. ACM (1998)
Kushilevitz, E., Ostrovsky, R., Rabani, Y.: Efficient search for approximate nearest neighbor in high dimensional spaces. SIAM J. Comput. 30(2), 457–474 (2000)
Motwani, R., Naor, A., Panigrahy, R.: Lower bounds on locality sensitive hashing. SIAM J. Discret. Math. 21(4), 930–935 (2007)
Mironov, I., Naor, M., Segev, G.: Sketching in adversarial environments. In: Ladner, R.E., Dwork, C., (eds.) 40th ACM STOC, pp. 651–660. ACM Press, May 2008
Newman, I., Szegedy, M.: Public versus private coin flips in one round communication games (extended abstract). In: 28th ACM STOC, pp. 561–570. ACM Press, May 1996
Apple wants to protect children. But it’s creating serious privacy risks. https://www.nytimes.com/2021/08/11/opinion/apple-iphones-privacy.html. Accessed 13 Feb 2022
O’Donnell, R., Wu, Y., Zhou, Y.: Optimal lower bounds for locality sensitive hashing (except when q is tiny). In: Chazelle, B. (ed.) Innovations in Computer Science - ICS 2011. Tsinghua University, Beijing, China, January 7–9, 2011. Proceedings, pp. 275–283. Tsinghua University Press (2011)
Pedersen, T.P.: Non-interactive and information-theoretic secure verifiable secret sharing. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 129–140. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-46766-1_9
Peterson, W.: Encoding and error-correction procedures for the Bose-Chaudhuri codes. IRE Trans. Inf. Theory 6(4), 459–470 (1960)
Reingold, O., Rothblum, G.N., Rothblum, R.D.: Constant-round interactive proofs for delegating computation. SIAM J. Comput. 50(3) (2021)
Apple adds a backdoor to iMessage and iCloud storage. https://www.schneier.com/blog/archives/2021/08/apple-adds-a-backdoor-to-imesssage-and-icloud-storage.html. Accessed 13 Feb 2022
Apple’s NeuralHash algorithm has been reverse-engineered. https://www.schneier.com/blog/archives/2021/08/apples-neuralhash-algorithm-has-been-reverse-engineered.html. Accessed 13 Feb 2022

Author information

Authors and Affiliations

NTT Research, Sunnyvale, CA, 94085, USA
Justin Holmgren & Daniel Wichs
Northeastern University, Boston, MA, 02115, USA
Minghao Liu, LaKyah Tyner & Daniel Wichs

Corresponding author

Correspondence to Daniel Wichs.

Editor information

Editors and Affiliations

New York University, New York, NY, USA
Yevgeniy Dodis
University of Florida, Gainesville, FL, USA
Thomas Shrimpton

About this paper

Cite this paper

Holmgren, J., Liu, M., Tyner, L., Wichs, D. (2022). Nearly Optimal Property Preserving Hashing. In: Dodis, Y., Shrimpton, T. (eds) Advances in Cryptology – CRYPTO 2022. CRYPTO 2022. Lecture Notes in Computer Science, vol 13509. Springer, Cham. https://doi.org/10.1007/978-3-031-15982-4_16

DOI: https://doi.org/10.1007/978-3-031-15982-4_16
Published: 12 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15981-7
Online ISBN: 978-3-031-15982-4
eBook Packages: Computer Science, Computer Science (R0)
