Abstract
We introduce a new technique for indexing joins in encrypted SQL databases called partially precomputed joins which achieves lower leakage and bandwidth than those used in prior constructions. These techniques are incorporated into state-of-the-art structured encryption schemes for SQL data, yielding a hybrid indexing scheme with both partially and fully precomputed join indexes. We then introduce the idea of leakage-aware query planning by giving a heuristic that helps the client decide, at query time, which index to use so as to minimize leakage and stay below a given bandwidth budget. We conclude by simulating our constructions on real datasets, showing that our heuristic is accurate and that partially-precomputed joins perform well in practice.
Notes
1. We excluded the \(\mathsf {film\_text}\) relation since it is a subset of the \(\mathsf {film}\) relation.
2. Note that in the case of \(\mathsf {FpSj},\mathsf {HybStI}\), this includes the multimap for internal joins.
References
encrypted-bigquery-client (2015). https://github.com/google/encrypted-bigquery-client
City of Chicago data portal (2021). https://data.cityofchicago.org/
Sakila sample database (2021). https://dev.mysql.com/doc/sakila/en/
Antonopoulos, P., et al.: Azure SQL database always encrypted. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 1511–1525 (2020)
Bajaj, S., Sion, R.: TrustedDB: a trusted hardware-based database with privacy and data confidentiality. IEEE Trans. Knowl. Data Eng. 26(3), 752–765 (2013)
Bater, J., Elliott, G., Eggen, C., Goel, S., Kho, A., Rogers, J.: SMCQL: secure querying for federated databases. Proc. VLDB Endow. 10(6), 673–684 (2017)
Bellare, M., Rogaway, P.: The security of triple encryption and a framework for code-based game-playing proofs. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 409–426. Springer, Heidelberg (2006). https://doi.org/10.1007/11761679_25
Bindschaedler, V., Grubbs, P., Cash, D., Ristenpart, T., Shmatikov, V.: The TAO of inference in privacy-protected databases. Proc. VLDB Endow. 11(11), 1715–1728 (2018)
Blackstone, L., Kamara, S., Moataz, T.: Revisiting leakage abuse attacks. Cryptology ePrint Archive, Report 2019/1175 (2019). https://eprint.iacr.org/2019/1175
Bruno, N., Gravano, L.: Statistics on query expressions in relational database management systems. Ph.D. thesis, Columbia University (2003)
Cao, Y., Fan, W., Wang, Y., Yi, K.: Querying shared data with security heterogeneity. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 575–585 (2020)
Cash, D., et al.: Dynamic searchable encryption in very-large databases: data structures and implementation. In: NDSS, vol. 14, pp. 23–26. Citeseer (2014)
Cash, D., Jarecki, S., Jutla, C., Krawczyk, H., Roşu, M.-C., Steiner, M.: Highly-scalable searchable symmetric encryption with support for Boolean queries. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 353–373. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_20
Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational data bases. In: Proceedings of the Ninth Annual ACM Symposium on Theory of Computing, pp. 77–90 (1977)
Chase, M., Kamara, S.: Structured encryption and controlled disclosure. In: Abe, M. (ed.) ASIACRYPT 2010. LNCS, vol. 6477, pp. 577–594. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-17373-8_33
Chow, S.S., Lee, J.-H., Subramanian, L.: Two-party computation model for privacy-preserving queries over distributed databases. In: NDSS. Citeseer (2009)
Ciriani, V., De Capitani di Vimercati, S., Foresti, S., Jajodia, S., Paraboschi, S., Samarati, P.: Keep a few: outsourcing data while maintaining confidentiality. In: Backes, M., Ning, P. (eds.) ESORICS 2009. LNCS, vol. 5789, pp. 440–455. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04444-1_27
Curtmola, R., Garay, J., Kamara, S., Ostrovsky, R.: Searchable symmetric encryption: improved definitions and efficient constructions. Cryptology ePrint Archive, Report 2006/210 (2006). https://eprint.iacr.org/2006/210
Damiani, E., Vimercati, S.D.C., Jajodia, S., Paraboschi, S., Samarati, P.: Balancing confidentiality and efficiency in untrusted relational DBMSs. In: Proceedings of the 10th ACM Conference on Computer and Communications Security, pp. 93–102 (2003)
Demertzis, I., Papadopoulos, D., Papamanthou, C., Shintre, S.: Seal: attack mitigation for encrypted databases via adjustable leakage. In: 29th USENIX Security Symposium (USENIX Security 2020) (2020)
Evdokimov, S., Günther, O.: Encryption techniques for secure database outsourcing. In: Biskup, J., López, J. (eds.) ESORICS 2007. LNCS, vol. 4734, pp. 327–342. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74835-9_22
Faber, S., Jarecki, S., Krawczyk, H., Nguyen, Q., Rosu, M., Steiner, M.: Rich queries on encrypted data: beyond exact matches. Cryptology ePrint Archive, Report 2015/927 (2015). https://eprint.iacr.org/2015/927
Garg, S., Mohassel, P., Papamanthou, C.: TWORAM: efficient oblivious RAM in two rounds with applications to searchable encryption. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9816, pp. 563–592. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53015-3_20
Grofig, P., et al.: Privacy by encrypted databases. In: Preneel, B., Ikonomou, D. (eds.) APF 2014. LNCS, vol. 8450, pp. 56–69. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-06749-0_4
Grubbs, P., Lacharité, M.-S., Minaud, B., Paterson, K.G.: Pump up the volume: practical database reconstruction from volume leakage on range queries. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 315–331 (2018)
Grubbs, P., Sekniqi, K., Bindschaedler, V., Naveed, M., Ristenpart, T.: Leakage-abuse attacks against order-revealing encryption. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 655–672. IEEE (2017)
Gui, Z., Johnson, O., Warinschi, B.: Encrypted databases: new volume attacks against range queries. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 361–378 (2019)
Hacigümüş, H., Iyer, B., Li, C., Mehrotra, S.: Executing SQL over encrypted data in the database-service-provider model. In: Proceedings of the 2002 ACM Sigmod International Conference on Management of Data, pp. 216–227 (2002)
Hackenjos, T., Hahn, F., Kerschbaum, F.: SAGMA: secure aggregation grouped by multiple attributes. ACM SIGMOD Record (2020)
Jaeger, J., Tyagi, N.: Handling adaptive compromise for practical encryption schemes. In: Micciancio, D., Ristenpart, T. (eds.) CRYPTO 2020. LNCS, vol. 12170, pp. 3–32. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-56784-2_1
Kamara, S., Moataz, T.: Encrypted multi-maps with computationally-secure leakage. IACR Cryptology ePrint Archive 2018, 978 (2018)
Kamara, S., Moataz, T.: SQL on structurally-encrypted databases. In: Peyrin, T., Galbraith, S. (eds.) ASIACRYPT 2018. LNCS, vol. 11272, pp. 149–180. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03326-2_6
Kamara, S., Moataz, T., Ohrimenko, O.: Structured encryption and leakage suppression. In: Shacham, H., Boldyreva, A. (eds.) CRYPTO 2018. LNCS, vol. 10991, pp. 339–370. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96884-1_12
Kamara, S., Moataz, T., Zdonik, S., Zhao, Z.: An optimal relational database encryption scheme. Cryptology ePrint Archive, Report 2020/274 (2020). https://eprint.iacr.org/2020/274. Accessed 29 Feb 2020
Kantarcıoǧlu, M., Clifton, C.: Security issues in querying encrypted data. In: Jajodia, S., Wijesekera, D. (eds.) DBSec 2005. LNCS, vol. 3654, pp. 325–337. Springer, Heidelberg (2005). https://doi.org/10.1007/11535706_24
E. S. Lab. The clusion library (2020). https://github.com/encryptedsystems/Clusion
Naveed, M., Kamara, S., Wright, C.V.: Inference attacks on property-preserving encrypted databases. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 644–655 (2015)
Patel, S., Persiano, G., Yeo, K., Yung, M.: Mitigating leakage in secure cloud-hosted data structures: Volume-hiding for multi-maps via hashing. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 79–93 (2019)
Popa, R.A., Redfield, C.M., Zeldovich, N., Balakrishnan, H.: CryptDB: protecting confidentiality with encrypted query processing. In: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, pp. 85–100 (2011)
Sack, J.: Optimizing your query plans with the SQL server 2014 cardinality estimator (2014)
Song, D.X., Wagner, D., Perrig, A.: Practical techniques for searches on encrypted data. In: Proceeding 2000 IEEE Symposium on Security and Privacy, S&P 2000, pp. 44–55. IEEE (2000)
Tu, S.L., Kaashoek, M.F., Madden, S.R., Zeldovich, N.: Processing analytical queries over encrypted data (2013)
Yang, Z., Zhong, S., Wright, R.N.: Privacy-preserving queries on encrypted data. In: Gollmann, D., Meier, J., Sabelfeld, A. (eds.) ESORICS 2006. LNCS, vol. 4189, pp. 479–495. Springer, Heidelberg (2006). https://doi.org/10.1007/11863908_29
Acknowledgements
We would like to thank the anonymous reviewers for their comments on our work. We are also grateful to Mihir Bellare and Francesca Falzon for discussion and insights. Cash was supported in part by NSF CNS 1703953. Ng was supported by DSO National Laboratories. Rivkin was supported by the Liew Family College Research Fellows Fund.
Appendices
A CJJ+’s Multimap/Dictionary Encryption Schemes
In our StE scheme constructed using \(\mathbf {SqlStE}\) (in Sect. 3) we use a specific RH dictionary encryption scheme to store the rows in \(\mathsf {DB}\). We formalize this as \(\mathsf {Dye}_{\pi }\) whose algorithms are in Fig. 13. The primitives (given as input to \(\mathbf {SqlStE}\)) used in \(\mathsf {Dye}_{\pi }\) are symmetric encryption scheme \(\mathsf {SE}\) and function family \(\mathsf {F}\). Note that \(\mathsf {Dye}_{\pi }\mathsf {.KS}=\mathsf {F}\mathsf {.KS}\times \mathsf {SE}\mathsf {.KS}\).
In our StI schemes such as \(\mathsf {PpJn},\mathsf {PpSj},\mathsf {HybStI}\) we use an RR multimap encryption scheme \(\mathsf {Mme}\) as a primitive. We give an example of such a scheme, \(\mathsf {Mme}^{\mathrm {rr}}_{\pi }\), which is also based on \(\Pi _{\mathrm {bas}}\). Its algorithms are in Fig. 13. The primitives are as in \(\mathsf {Dye}_{\pi }\), but we require that \(\mathsf {SE}\mathsf {.KS}=\{0,1\}^{{\mathsf {F}\mathsf {.ol}}}\). Note that \(\mathsf {Mme}^{\mathrm {rr}}_{\pi }\mathsf {.KS}=\mathsf {F}\mathsf {.KS}\).
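To make the shape of a \(\Pi _{\mathrm {bas}}\)-style response-revealing multimap concrete, the following is a minimal Python sketch. It is not the algorithm of Fig. 13, which is not reproduced here. It assumes HMAC-SHA256 as a stand-in for \(\mathsf {F}\) and a toy XOR-keystream cipher as a stand-in for \(\mathsf {SE}\); the names prf, se_enc, se_dec, mm_enc, mm_token and mm_eval are hypothetical. The per-label encryption key is a PRF output, which mirrors the requirement \(\mathsf {SE}\mathsf {.KS}=\{0,1\}^{{\mathsf {F}\mathsf {.ol}}}\).

```python
import hashlib
import hmac
import os


def prf(key: bytes, msg: bytes) -> bytes:
    """Stand-in for F.Ev (assumption: HMAC-SHA256 modeled as a PRF)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Expand (key, nonce) into n pseudorandom bytes, counter-mode style.
    out, ctr = b"", 0
    while len(out) < n:
        out += prf(key, nonce + ctr.to_bytes(4, "big"))
        ctr += 1
    return out[:n]


def se_enc(key: bytes, pt: bytes) -> bytes:
    """Toy stand-in for SE.Enc (illustration only, not a vetted cipher)."""
    nonce = os.urandom(16)
    ks = _keystream(key, nonce, len(pt))
    return nonce + bytes(a ^ b for a, b in zip(pt, ks))


def se_dec(key: bytes, ct: bytes) -> bytes:
    nonce, body = ct[:16], ct[16:]
    ks = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, ks))


def mm_token(key: bytes, label: bytes):
    # The token is a pair of per-label keys; the SE key is a PRF output.
    return prf(key, b"1" + label), prf(key, b"2" + label)


def mm_enc(key: bytes, multimap: dict) -> dict:
    """Encrypt {label: [values]} into a flat dictionary of (tag, ciphertext)."""
    edict = {}
    for label, values in multimap.items():
        k1, k2 = mm_token(key, label)
        for c, value in enumerate(values):
            edict[prf(k1, c.to_bytes(4, "big"))] = se_enc(k2, value)
    # Sort by pseudorandom tag so storage order does not follow insertion order.
    return dict(sorted(edict.items()))


def mm_eval(token, edict: dict) -> list:
    """Server side: walk the counter until a tag is missing, decrypting as we go."""
    k1, k2 = token
    out, c = [], 0
    while True:
        ct = edict.get(prf(k1, c.to_bytes(4, "big")))
        if ct is None:
            return out
        out.append(se_dec(k2, ct))
        c += 1
```

Because the token carries the per-label decryption key, the server learns the values associated with a queried label, which is what makes this sketch response-revealing. A response-hiding variant would instead keep the decryption key at the client.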
B Proof of Theorem 1
Theorem 1. Let \(\mathsf {StE}=\mathbf {SqlStE}[\mathsf {StI},\mathsf {SE},\mathsf {F}]\) be a correct StE scheme for \(\mathsf {SqlDT}\). Then given algorithms \(\mathcal {L}^{\mathrm {i}},\mathcal {S}^{\mathrm {i}}\) and adversary \(A\) we can define \(\mathcal {L}\) as in Sect. 3.2 and construct \(\mathcal {S},A_{\mathrm {s}},A_{\mathrm {f}},A_{\mathrm {i}}\) such that:
Proof
The adversaries, simulator and games \(\mathrm {G}_0,\mathrm {G}_1,\mathrm {G}_2,\mathrm {G}_3\) are given in Fig. 14. Notice that the \(\mathsf {EncRows}\) algorithm used in the adversaries and games is given at the top, and uses two oracles \(\textsc {Enc},\textsc {Fn}\) which the algorithms define. Let b be the challenge bit selected in \(\mathrm {G}^{\mathrm {ss}}_{{\mathsf {StE},\mathcal {L},\mathcal {S}}}(A)\).
Notice that we can express \(\mathbf {Adv}^{\mathrm {ss}}_{\mathsf {StE},\mathcal {L},\mathcal {S}}(A)=\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {StE},\mathcal {L},\mathcal {S}}}(A) | b=1] - \Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {StE},\mathcal {L},\mathcal {S}}}(A) | b=0]=\Pr [\mathrm {G}_3]-\Pr [\mathrm {G}_0]\). In the \(b=1\) case, this follows directly from the definition of \(A_{\mathrm {i}}\). In the \(b=0\) case, this follows from the definition of \(\mathcal {L}^{\mathrm {i}},\mathcal {S}^{\mathrm {i}}\).
The only difference between \(\mathrm {G}_0\) and \(\mathrm {G}_1\) is whether \(\mathsf {IX},\mathsf {tk}_1,\ldots ,\mathsf {tk}_n\) are generated using \(\mathsf {StI}\)’s algorithms or \(\mathcal {S}\). In both cases, \(\mathbf {D}'\)’s values are encrypted using \(\mathsf {SE}\mathsf {.Enc}\). This is the same distinction made in the semantic security game for \(\mathsf {StI}\), so \(\mathbf {Adv}^{\mathrm {ss}}_{\mathsf {StI},\mathcal {L}^{\mathrm {i}},\mathcal {S}^{\mathrm {i}}}(A_{\mathrm {i}})=\Pr [\mathrm {G}_1]-\Pr [\mathrm {G}_0]\). Similarly, the difference between \(\mathrm {G}_1\) and \(\mathrm {G}_2\) is whether the values in \(\mathbf {D}'\) are the output of \(\mathsf {SE}\mathsf {.Enc}\) or random strings, which is the distinction made in the IND$-security game \(\mathrm {G}^{\mathrm {ind\$}}_{\mathsf {SE}}(A_{\mathrm {s}})\), so \(\mathbf {Adv}^{\mathrm {ind\$}}_{\mathsf {SE}}(A_{\mathrm {s}})=\Pr [\mathrm {G}_2]-\Pr [\mathrm {G}_1]\). Finally, the difference between \(\mathrm {G}_2\) and \(\mathrm {G}_3\) is whether the labels in \(\mathbf {D}'\) (i.e., the tokens in \(\mathsf {Dye}_{\pi }\mathsf {.Enc}\)) are generated using \(\mathsf {F}\mathsf {.Ev}\) or a random function, which is the distinction made in the PRF-security game \(\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_{\mathrm {f}})\), so \(\mathbf {Adv}^{\mathrm {prf}}_{\mathsf {F}}(A_{\mathrm {f}})=\Pr [\mathrm {G}_3]-\Pr [\mathrm {G}_2]\).
Combining all the above equations gives the desired bound on \(\mathbf {Adv}^{\mathrm {ss}}_{\mathsf {StE},\mathcal {L},\mathcal {S}}(A)\).
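The displayed bound of Theorem 1 is missing from this copy, so the following telescoping sum is a reconstruction from the proof's own game differences, not a quotation of the theorem. It reads the \(\mathrm {G}_0\)-to-\(\mathrm {G}_1\) difference as the semantic-security advantage of \(A_{\mathrm {i}}\) against \(\mathsf {StI}\):

\[
\begin{aligned}
\mathbf {Adv}^{\mathrm {ss}}_{\mathsf {StE},\mathcal {L},\mathcal {S}}(A)
&= \Pr [\mathrm {G}_3]-\Pr [\mathrm {G}_0] \\
&= \big (\Pr [\mathrm {G}_1]-\Pr [\mathrm {G}_0]\big ) + \big (\Pr [\mathrm {G}_2]-\Pr [\mathrm {G}_1]\big ) + \big (\Pr [\mathrm {G}_3]-\Pr [\mathrm {G}_2]\big ) \\
&= \mathbf {Adv}^{\mathrm {ss}}_{\mathsf {StI},\mathcal {L}^{\mathrm {i}},\mathcal {S}^{\mathrm {i}}}(A_{\mathrm {i}}) + \mathbf {Adv}^{\mathrm {ind\$}}_{\mathsf {SE}}(A_{\mathrm {s}}) + \mathbf {Adv}^{\mathrm {prf}}_{\mathsf {F}}(A_{\mathrm {f}}).
\end{aligned}
\]

Under this reading, the advantage against \(\mathsf {StE}\) is bounded by the sum of the three advantages against its building blocks.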
C Leakage Profile and Security Proof for \(\mathsf {PpSj}\)
Theorem 3
Let \(\mathcal {L},\mathcal {S}\) be the leakage algorithm and simulator for \(\mathsf {Mme}\), respectively. Let \(\mathcal {L}^{\mathrm {p}},\mathcal {S}^{\mathrm {p}}\) be as defined in Fig. 15 and let \(\mathsf {F}\) be the function family used. Then for all adversaries \(A\) there exist adversaries \(A_{\mathrm {m}},A_{\mathrm {f}}\) such that:
Here, p is the number of distinct predicates used in constructing \(\mathsf {HS}\).
Proof
Adversary \(A_{\mathrm {m}}\) is given in Fig. 16. In the same diagram, we see \(A_1,A_2\) which are both PRF adversaries playing \(\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}\). We define \(A_{\mathrm {f}}\) to randomly pick one at run time and use it.
Now we can proceed via a standard hybrid argument. Let \(b_{\mathrm {p}},b_{\mathrm {f}},b_{\mathrm {m}}\) be the challenge bits in \(\mathrm {G}^{\mathrm {ss}}_{{\mathsf {PpSj},\mathcal {L}^{\mathrm {p}},\mathcal {S}^{\mathrm {p}}}}\), \(\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}\) and \(\mathrm {G}^{\mathrm {ss}}_{{\mathsf {Mme},\mathcal {L},\mathcal {S}}}\) respectively.
From the various advantage definitions, we have that \(\mathbf {Adv}^{\mathrm {ss}}_{\mathsf {PpSj},\mathcal {L}^{\mathrm {p}},\mathcal {S}^{\mathrm {p}}}(A)=\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {PpSj},\mathcal {L}^{\mathrm {p}},\mathcal {S}^{\mathrm {p}}}}(A)| b_{\mathrm {p}}=1]-\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {PpSj},\mathcal {L}^{\mathrm {p}},\mathcal {S}^{\mathrm {p}}}}(A)|b_{\mathrm {p}}=0]\), \(\mathbf {Adv}^{\mathrm {ss}}_{\mathsf {Mme},\mathcal {L},\mathcal {S}}(A_{\mathrm {m}})=\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {Mme},\mathcal {L},\mathcal {S}}}(A_{\mathrm {m}})| b_{\mathrm {m}}=1]-\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {Mme},\mathcal {L},\mathcal {S}}}(A_{\mathrm {m}})| b_{\mathrm {m}}=0]\), and \(\mathbf {Adv}^{\mathrm {prf}}_{\mathsf {F}}(A_1)=\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_1)|b_{\mathrm {f}}=1]-\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_1)|b_{\mathrm {f}}=0]\). Notice also that \(\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_2)|b_{\mathrm {f}}=0,c=i]=\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_2)|b_{\mathrm {f}}=1,c=i+1]\) for \(i\in [p-1]\) and \(\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_2)|b_{\mathrm {f}}=1,c=j]-\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_2)|b_{\mathrm {f}}=0,c=j]\le \mathbf {Adv}^{\mathrm {prf}}_{\mathsf {F}}(A_2)\) for \(j\in [p]\). This means that
Notice that \(A_{\mathrm {m}}\) in \(\mathrm {G}^{\mathrm {ss}}_{{\mathsf {Mme},\mathcal {L},\mathcal {S}}}\) uses the game to simulate multimap encryption and performs the rest itself, as happens in the “real world” of \(\mathrm {G}^{\mathrm {ss}}_{{\mathsf {PpSj},\mathcal {L}^{\mathrm {p}},\mathcal {S}^{\mathrm {p}}}}(A)\). This gives \(\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {PpSj},\mathcal {L}^{\mathrm {p}},\mathcal {S}^{\mathrm {p}}}}(A)| b_{\mathrm {p}}=1]=\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {Mme},\mathcal {L},\mathcal {S}}}(A_{\mathrm {m}})| b_{\mathrm {m}}=1]\). Similarly, \(A_1\) simulates multimap encryption as in the “ideal world” of \(\mathrm {G}^{\mathrm {ss}}_{{\mathsf {Mme},\mathcal {L},\mathcal {S}}}\) and defers the filtering key production to \(\textsc {Fn}\), which gives us \(\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {Mme},\mathcal {L},\mathcal {S}}}(A_{\mathrm {m}})| b_{\mathrm {m}}=0]=\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_1)|b_{\mathrm {f}}=1]\). When \(A_2\) plays \(\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_2)\), if \(c=p\) then all the \(K_i\) will be randomly selected. This means \(\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_1)|b_{\mathrm {f}}=0]=\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_2)|b_{\mathrm {f}}=1,c=p]\). Over p hybrids, we get to the version where all the \(\mathsf {F}\mathsf {.Ev}(K_i,\cdot )\) (where \(K_i\) is not revealed to the adversary) are simulated with random functions, giving us \(\Pr [\mathrm {G}^{\mathrm {prf}}_{\mathsf {F}}(A_2)|b_{\mathrm {f}}=0,c=1]=\Pr [\mathrm {G}^{\mathrm {ss}}_{{\mathsf {PpSj},\mathcal {L}^{\mathrm {p}},\mathcal {S}^{\mathrm {p}}}}(A)|b_{\mathrm {p}}=0]\) because this selects all of the elements of \(\mathsf {HS}\) as \(\mathcal {S}^{\mathrm {p}}\) does.
© 2021 Springer Nature Switzerland AG
Cite this paper
Cash, D., Ng, R., Rivkin, A. (2021). Improved Structured Encryption for SQL Databases via Hybrid Indexing. In: Sako, K., Tippenhauer, N.O. (eds) Applied Cryptography and Network Security. ACNS 2021. Lecture Notes in Computer Science(), vol 12727. Springer, Cham. https://doi.org/10.1007/978-3-030-78375-4_19
DOI: https://doi.org/10.1007/978-3-030-78375-4_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78374-7
Online ISBN: 978-3-030-78375-4
eBook Packages: Computer Science (R0)