Nearly Optimal Property Preserving Hashing

Advances in Cryptology – CRYPTO 2022 (CRYPTO 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13509)

Abstract

Property-preserving hashing (PPH) consists of a family of compressing hash functions h such that, for any two inputs x, y, we can correctly identify whether some property P(x, y) holds given only the digests h(x), h(y). In a basic PPH, correctness should hold with overwhelming probability over the choice of h when x, y are worst-case values chosen a priori and independently of h. In an adversarially robust PPH (RPPH), correctness must hold even when x, y are chosen adversarially and adaptively depending on h. Here, we study (R)PPH for the property that the Hamming distance between x and y is at most t.

The notion of (R)PPH was introduced by Boyle, LaVigne and Vaikuntanathan (ITCS ’19), and further studied by Fleischhacker and Simkin (Eurocrypt ’21) and by Fleischhacker, Larsen and Simkin (Eurocrypt ’22). In this work, we obtain improved constructions that are conceptually simpler, have nearly optimal parameters, and rely on more general assumptions than prior works. Our results are:

  • We construct information-theoretic non-robust PPH for Hamming distance via syndrome list-decoding of linear error-correcting codes. We provide a lower bound showing that this construction is essentially optimal.

  • We make the above construction robust with little additional overhead, by relying on homomorphic collision-resistant hash functions, which can be constructed from either the discrete-logarithm or the short-integer-solution assumptions. The resulting RPPH achieves improved compression compared to prior constructions, and is nearly optimal.

  • We also show an alternate construction of RPPH for Hamming distance under the minimal assumption that standard collision-resistant hash functions exist. The compression is slightly worse than our optimized construction using homomorphic collision-resistance, but essentially matches the prior state-of-the-art constructions from specific algebraic assumptions.

  • Lastly, we study a new notion of randomized robust PPH (R2P2H) for Hamming distance, which relaxes RPPH by allowing the hashing algorithm itself to be randomized. We give an information-theoretic construction with optimal parameters.
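As a rough illustration of the first (non-robust) idea above, the following Python sketch hashes x to its syndrome Hx under a random binary matrix H and, given two digests, searches by brute force for an error vector of weight at most t whose syndrome equals h(x) − h(y). It is a toy under stated assumptions, not the paper's construction: the function names, the sizes m and n, the choice of a uniformly random H, and the exhaustive search (standing in for an efficient syndrome list-decoder) are all assumptions made for illustration.

```python
import itertools
import random


def random_parity_check(m, n, rng):
    """Sample a random m x n binary matrix, stored as n column bitmasks (assumption)."""
    return [rng.getrandbits(m) for _ in range(n)]


def pph_hash(H, x):
    """Digest of the bit list x: the syndrome H*x over F_2 (XOR of selected columns)."""
    s = 0
    for col, bit in zip(H, x):
        if bit:
            s ^= col
    return s


def pph_eval(H, t, hx, hy):
    """Return True iff some vector e of Hamming weight <= t satisfies H*e = hx - hy."""
    s = hx ^ hy  # over F_2 this equals H*(x - y) by linearity
    n = len(H)
    for w in range(t + 1):
        # Brute force over all supports of size w; only feasible for toy sizes.
        for support in itertools.combinations(range(n), w):
            acc = 0
            for i in support:
                acc ^= H[i]
            if acc == s:
                return True
    return False


# Toy usage: 24-bit inputs compressed to 16-bit digests, threshold t = 2.
rng = random.Random(1)
H = random_parity_check(16, 24, rng)
x = [rng.randint(0, 1) for _ in range(24)]
y = list(x)
y[3] ^= 1
y[17] ^= 1  # y is at Hamming distance 2 from x
print(pph_eval(H, 2, pph_hash(H, x), pph_hash(H, y)))  # True: x - y is a valid low-weight error
```

Close pairs are always accepted in this sketch, since x − y is itself a candidate error. For a fixed pair at distance greater than t, a union bound over the low-weight candidates shows that a false accept occurs with probability at most (number of vectors of weight at most t) / 2^m over the choice of the random H; the sketch makes no claim of robustness against inputs chosen after seeing H.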

Research supported by NSF grants CNS-1750795 and CNS-2055510 and the Alfred P. Sloan Research Fellowship.

Notes

  1. Technically, the \(\textsf{Eval}\) procedure also takes as input the description of the hash function h, but for simplicity we omit this throughout the introduction.

  2. They also considered two additional intermediate variants of PPH, where the adversary does not get the full description of the hash function but gets some partial oracle access before choosing x, y. Our notion of robust PPH is the strongest notion they considered and is also referred to as a “direct access robust” PPH in their work.

  3. Asymptotically, the existence of CRHFs with output length \(\ell (\lambda )=\lambda \) is equivalent to the existence of those with output length \(\ell (\lambda )=\lambda ^\varepsilon \) for \(\varepsilon >0\). Moreover, it may even be plausible to conjecture the existence of CRHFs with, e.g., output length \(\ell (\lambda ) = \log \lambda \log \log \lambda \). However, these choices will have vastly different exact security. All the constructions/reductions referred to in this work preserve exact security. Therefore, we find it more informative to phrase all results in terms of the exact output length \(\ell (\lambda )\) of the underlying primitive, and the construction will inherit the exact security of that primitive with the given output length.

  4. Since we’re working over \({\mathbb F}_2\), addition and subtraction are equivalent, but we use subtraction to make it easier to compare to later constructions that work in larger fields.

  5. On the other hand, it allows us to use codes over \({\mathbb F}_3\) which may have slightly improved rate compared to ones over \({\mathbb F}_2\).

  6. A heuristic construction would be to define the hash function \(h_{CR}\) whose description consists of an obfuscated program that has a hard-coded random matrix \(A \leftarrow {\mathbb Z}_2^{\lambda \times n}\) and a key k for a pseudorandom permutation \(\pi _k : \{0,1\}^{\lambda } \rightarrow \{0,1\}^\lambda \). On input (“hash”, x) the program would output \(\pi _k(Ax)\), which we would also define as the output of the hash function \(h_{CR}(x)\). On input (“homomorphism”, \(y_1\), \(y_2\)) the program would output \(\pi _k( \pi ^{-1}_k(y_1) - \pi ^{-1}_k(y_2))\), which would allow us to implement the homomorphic operation on the hash outputs.

  7. There are multiple possible parity check matrices for any code, but the specific choice will be unimportant for us.

  8. The asymptotic bound of \(\textrm{poly}(n)\) in fact assumes that we have a family of codes for an infinite and dense set of n.

  9. In fact, a random linear code is known to have the stated list decodability with high probability.
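The algebra behind the heuristic construction in note 6 can be sketched in a few lines of Python. This is only a functional toy under loud assumptions: there is no obfuscation and no security, a shuffled lookup table stands in for the keyed pseudorandom permutation \(\pi _k\), and the sizes `LAM` and `N` and all function names are hypothetical choices made for illustration.

```python
import random

LAM, N = 8, 16  # toy sizes (hypothetical); real parameters would be far larger

rng = random.Random(0)
# Columns of a random matrix A in Z_2^{LAM x N}, each stored as a LAM-bit mask.
A = [rng.getrandbits(LAM) for _ in range(N)]
# A random lookup-table permutation stands in for the keyed PRP pi_k (assumption).
PI = list(range(2 ** LAM))
rng.shuffle(PI)
PI_INV = [0] * len(PI)
for i, p in enumerate(PI):
    PI_INV[p] = i


def apply_A(x):
    """Compute A*x over Z_2 for a bit list x (XOR of the selected columns)."""
    s = 0
    for col, bit in zip(A, x):
        if bit:
            s ^= col
    return s


def h_cr(x):
    """The ("hash", x) branch: output pi_k(A*x)."""
    return PI[apply_A(x)]


def hom_sub(y1, y2):
    """The ("homomorphism", y1, y2) branch: pi_k(pi_k^{-1}(y1) - pi_k^{-1}(y2)).

    Over Z_2, subtraction is XOR.
    """
    return PI[PI_INV[y1] ^ PI_INV[y2]]
```

By linearity of A, `hom_sub(h_cr(x), h_cr(y))` equals `h_cr` applied to the bitwise difference of x and y, which is the homomorphic operation the note describes.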

References

  1. Ajtai, M.: Generating hard instances of lattice problems (extended abstract). In: 28th ACM STOC, pp. 99–108. ACM Press (1996)

  2. Apple CSAM detection. https://www.apple.com/child-safety/pdf/CSAM_Detection_Technical_Summary.pdf. Accessed 13 Feb 2022

  3. Babai, L., Kimmel, P.G.: Randomized simultaneous messages: solution of a problem of Yao in communication complexity. In: Twelfth Annual IEEE Conference on Proceedings of Computational Complexity, pp. 239–246 (1997)

  4. Boyle, E., LaVigne, R., Vaikuntanathan, V.: Adversarially robust property-preserving hash functions. In: Blum, A. (ed.) ITCS 2019, vol. 124, pp. 16:1–16:20. LIPIcs (2019)

  5. Blokh, E.L., Zyablov, V.: Linear concatenated codes. Nauka (1982)

  6. Cohen, S.P., Naor, M.: Low communication complexity protocols, collision resistant hash functions and secret key-agreement protocols. Cryptol. ePrint Arch. Paper 2022/312, 2022. https://eprint.iacr.org/2022/312

  7. Apple’s CSAM detection tech is under fire - again. https://techcrunch.com/2021/08/18/apples-csam-detection-tech-is-under-fire-again/. Accessed 13 Feb 2022

  8. Dodis, Y., Ostrovsky, R., Reyzin, L., Smith, A.D.: Fuzzy extractors: how to generate strong keys from biometrics and other noisy data. SIAM J. Comput. 38(1), 97–139 (2008)

  9. Fleischhacker, N., Larsen, K.G., Simkin, M.: Property-preserving hash functions from standard assumptions. EUROCRYPT (2022)

  10. Fleischhacker, N., Simkin, M.: Robust property-preserving hash functions for Hamming distance and more. In: Canteaut, A., Standaert, F.-X. (eds.) EUROCRYPT 2021. LNCS, vol. 12698, pp. 311–337. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-77883-5_11

  11. Guruswami, V., Håstad, J., Kopparty, S.: On the list-decodability of random linear codes. In: Schulman, L.J. (ed.) Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, Massachusetts, USA, 5–8 June 2010, pp. 409–416. ACM (2010)

  12. Guruswami, V., Rudra, A.: Better binary list decodable codes via multilevel concatenation. IEEE Trans. Inf. Theory 55(1), 19–26 (2009)

  13. Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: Vitter, J.S. (ed.) Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, Dallas, Texas, USA, May 23–26, 1998, pp. 604–613. ACM (1998)

  14. Kushilevitz, E., Ostrovsky, R., Rabani, Y.: Efficient search for approximate nearest neighbor in high dimensional spaces. SIAM J. Comput. 30(2), 457–474 (2000)

  15. Motwani, R., Naor, A., Panigrahy, R.: Lower bounds on locality sensitive hashing. SIAM J. Discret. Math. 21(4), 930–935 (2007)

  16. Mironov, I., Naor, M., Segev, G.: Sketching in adversarial environments. In: Ladner, R.E., Dwork, C. (eds.) 40th ACM STOC, pp. 651–660. ACM Press, May 2008

  17. Newman, I., Szegedy, M.: Public versus private coin flips in one round communication games (extended abstract). In: 28th ACM STOC, pp. 561–570. ACM Press, May 1996

  18. Apple wants to protect children. But it’s creating serious privacy risks. https://www.nytimes.com/2021/08/11/opinion/apple-iphones-privacy.html. Accessed 13 Feb 2022

  19. O’Donnell, R., Wu, Y., Zhou, Y.: Optimal lower bounds for locality sensitive hashing (except when q is tiny). In: Chazelle, B. (ed.) Innovations in Computer Science - ICS 2011. Tsinghua University, Beijing, China, January 7–9, 2011. Proceedings, pp. 275–283. Tsinghua University Press (2011)

  20. Pedersen, T.P.: Non-interactive and information-theoretic secure verifiable secret sharing. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 129–140. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-46766-1_9

  21. Peterson, W.: Encoding and error-correction procedures for the Bose-Chaudhuri codes. IRE Trans. Inf. Theory 6(4), 459–470 (1960)

  22. Reingold, O., Rothblum, G.N., Rothblum, R.D.: Constant-round interactive proofs for delegating computation. SIAM J. Comput. 50(3) (2021)

  23. Apple adds a backdoor to iMessage and iCloud storage. https://www.schneier.com/blog/archives/2021/08/apple-adds-a-backdoor-to-imesssage-and-icloud-storage.html. Accessed 13 Feb 2022

  24. Apple’s NeuralHash algorithm has been reverse-engineered. https://www.schneier.com/blog/archives/2021/08/apples-neuralhash-algorithm-has-been-reverse-engineered.html. Accessed 13 Feb 2022

Author information

Corresponding author

Correspondence to Daniel Wichs.

Copyright information

© 2022 International Association for Cryptologic Research

About this paper

Cite this paper

Holmgren, J., Liu, M., Tyner, L., Wichs, D. (2022). Nearly Optimal Property Preserving Hashing. In: Dodis, Y., Shrimpton, T. (eds) Advances in Cryptology – CRYPTO 2022. CRYPTO 2022. Lecture Notes in Computer Science, vol 13509. Springer, Cham. https://doi.org/10.1007/978-3-031-15982-4_16

  • DOI: https://doi.org/10.1007/978-3-031-15982-4_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15981-7

  • Online ISBN: 978-3-031-15982-4

  • eBook Packages: Computer Science (R0)
