
VBTree: forward secure conjunctive queries over encrypted data for cloud computing

  • Regular Paper
The VLDB Journal

Abstract

This paper concerns the fundamental problem of processing conjunctive keyword queries over an outsourced data table on untrusted public clouds in a privacy-preserving manner. The data table can be implemented with tree-based searchable symmetric encryption schemes, such as the well-known Keyword Red–Black tree and the Indistinguishable Bloom-filter Tree presented at ICDE’17. However, these trees still have many limitations in supporting sub-linear-time updates. One reason is that their tree branches are directly exposed to the cloud. To achieve efficient conjunctive queries while supporting dynamic updates, we introduce a novel tree data structure called the virtual binary tree (VBTree). Our key design is to organize indexing elements into the VBTree in a top-down fashion, without storing any tree branches or tree nodes. The tree exists only as a logical view, and all of the elements are actually stored in a hash table. To achieve forward privacy, as discussed by Bost at CCS’16, we also propose a storage mechanism called the version control repository (VCR), which records and controls versions of keywords and queries. VCR requires less client-side storage than other forward-private schemes. With our proposed approach, data elements can be searched quickly while the index can be updated privately. The security of the VBTree is formally proved under the IND-CKA2 model. We test our scheme on a real e-mail dataset and a user location dataset. The results demonstrate its high efficiency and scalability in both searching and updating.
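The idea of a tree that exists only as a logical view over a hash table can be illustrated with a toy sketch. The code below is not the VBTree construction itself, whose details are not given in the abstract. It is a simplified illustration under stated assumptions: node identities are binary path strings, index elements are HMAC-SHA256 labels of (path, keyword), client and server are collapsed into one class, and forward privacy and the VCR are omitted entirely. The names `ToyVirtualTree` and `node_token` are hypothetical.

```python
import hashlib
import hmac


def node_token(key: bytes, path: str, keyword: str) -> bytes:
    """Pseudo-random label for a (tree node, keyword) pair (assumed encoding)."""
    msg = f"{path}|{keyword}".encode()
    return hmac.new(key, msg, hashlib.sha256).digest()


class ToyVirtualTree:
    """Full binary tree of a fixed height >= 1 that is never materialized."""

    def __init__(self, key: bytes, height: int):
        if height < 1:
            raise ValueError("height must be at least 1")
        self.key = key
        self.height = height
        self.table = set()  # the hash table: no nodes or branches are stored

    def insert(self, leaf: int, keywords):
        # A leaf's path is its index written with exactly `height` bits.
        path = format(leaf, f"0{self.height}b")
        for kw in keywords:
            # Top-down: one element for every node on the root-to-leaf path.
            for depth in range(self.height + 1):
                self.table.add(node_token(self.key, path[:depth], kw))

    def search(self, keywords):
        """Return the leaves whose record contains every queried keyword."""
        hits, stack = [], [""]  # "" is the path of the root
        while stack:
            path = stack.pop()
            if not all(node_token(self.key, path, kw) in self.table
                       for kw in keywords):
                continue  # some keyword is absent below this node: prune
            if len(path) == self.height:
                hits.append(int(path, 2))
            else:
                stack.extend((path + "0", path + "1"))
        return sorted(hits)


tree = ToyVirtualTree(b"k" * 16, height=3)
tree.insert(0, ["alice", "cloud"])
tree.insert(5, ["alice", "tree"])
tree.insert(6, ["alice", "cloud", "tree"])
print(tree.search(["alice", "cloud"]))
```

In this sketch a conjunctive query descends only into subtrees where every keyword has a label, so pruning happens without any stored branch pointers; an update touches one root-to-leaf path per keyword.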


References

  1. Amazon: “Amazon Web services” (2017). http://aws.amazon.com

  2. Microsoft: “Microsoft Azure” (2017). http://www.microsoft.com/azure

  3. Google: “Google App Engine” (2017). http://code.google.com/appengine

  4. Zhang, Y., Katz, J., Papamanthou, C.: All your queries are belong to us: the power of file-injection attacks on searchable encryption. In: 25th USENIX Security Symposium (USENIX), pp. 707–720. USENIX Association (2016)

  5. Curtmola, R., Garay, J., Kamara, S., et al.: Searchable symmetric encryption: improved definitions and efficient constructions. In: Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS), vol. 95, No. 5, pp. 79–88. ACM (2006)

  6. Kamara, S., Papamanthou, C.: Parallel and dynamic searchable symmetric encryption. In: Sadeghi, A.R. (ed.) Financial Cryptography and Data Security FC 2013. Lecture Notes in Computer Science, vol. 7859, pp. 258–274. Springer, Berlin, Heidelberg (2013)

  7. Bost, R.: \(\Sigma o\varphi o\varsigma \): forward secure searchable encryption. In: ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 1143–1154. ACM (2016)

  8. Liu, Z., Lv, S., et al.: FFSSE: flexible forward secure searchable encryption with efficient performance. IACR Cryptology ePrint Archive (2017)

  9. Li, R., Liu, A.X.: Adaptively secure conjunctive query processing over encrypted data for cloud computing. In: International Conference on Data Engineering (ICDE), pp. 697–708. IEEE (2017)

  10. Goh, E.J.: Secure indexes. IACR Cryptology ePrint Archive (2003)

  11. Li, R., Liu, A.X., Wang, A.L., et al.: Fast range query processing with strong privacy protection for cloud computing. In: International Conference on Very Large Data Bases (VLDB), pp. 1953–1964 (2014)

  12. Naveed, M., Prabhakaran, M., Gunter, C.A.: Dynamic searchable encryption via blind storage. In: Security and Privacy (S&P), pp. 639–654 (2014)

  13. Li, R., Liu, A.X., Wang, A.L., et al.: Fast and scalable range query processing with strong privacy protection for cloud computing. In: Transactions on Networking (TON), pp. 2305–2318 (2016)

  14. Xia, Z., Wang, X., Sun, X., Wang, Q.: A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. Trans. Parallel Distrib. Syst. (TPDS) 27(2), 340–352 (2016)

  15. Song, D.X., Wagner, D., Perrig, A.: Practical techniques for searches on encrypted data. In: IEEE Symposium on Security and Privacy (S&P), pp. 44–55 (2000)

  16. Chang, Y.C., Mitzenmacher, M.: Privacy preserving keyword searches on remote encrypted data. In: Applied Cryptography and Network Security (ACNS), pp. 442–455. Springer, Berlin (2005)

  17. Bezawada, B., Liu, A.X., Jayaraman, B., et al.: Privacy preserving string matching for cloud computing. In: IEEE International Conference on Distributed Computing Systems (ICDCS), pp. 609–618 (2015)

  18. Chase, M., Kamara, S.: Structured encryption and controlled disclosure. In: Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT), pp. 577–594. Springer, New York (2010)

  19. Kurosawa, K., Ohtaki, Y.: UC-secure searchable symmetric encryption. In: Financial Cryptography and Data Security (FC), pp. 285–298. Springer, New York (2012)

  20. Liesdonk, P.V., Sedghi, S., Doumen, J., Hartel, P., Jonker, W.: Computationally efficient searchable symmetric encryption. In: VLDB Conference on Secure Data Management (SDM), pp. 87–100. Springer, New York (2010)

  21. Cash, D., Jarecki, S., Jutla, C., et al.: Highly-scalable searchable symmetric encryption with support for Boolean queries. In: International Cryptology Conference (CRYPTO), pp. 353–373. Springer, New York (2013)

  22. Pappas, V., Krell, F., Vo, B., et al.: Blind seer: a scalable private DBMS. In: Security and Privacy (S&P), pp. 359–374 (2014)

  23. Ishai, Y., Kushilevitz, E., Lu, S., et al.: Private large-scale databases with distributed searchable symmetric encryption. In: Cryptographers’ Track at the RSA Conference, pp. 90–107. Springer, New York (2016)

  24. Kamara, S., Moataz, T.: SQL on structurally-encrypted databases. IACR Cryptology ePrint Archive (2016)

  25. Kamara, S., Papamanthou, C., Roeder, T.: Dynamic searchable symmetric encryption. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security (CCS), pp. 965–976. ACM (2012)

  26. Cash, D., Jaeger, J., Jarecki, S., et al.: Dynamic searchable encryption in very-large databases: data structures and implementation. In: Network and Distributed System Security (NDSS), pp. 23–26. ISOC (2014)

  27. Kamara, S., Moataz, T.: Boolean searchable symmetric encryption with worst-case sub-linear complexity. In: European Cryptology Conference (EUROCRYPT). Springer, New York (2017)

  28. Wang, B., Yu, S., Lou, W., et al.: Privacy-preserving multi-keyword fuzzy search over encrypted data in the cloud. In: INFOCOM, pp. 2112–2120 (2014)

  29. Fu, Z., Wu, X., Guan, C., et al.: Toward efficient multi-keyword fuzzy search over encrypted outsourced data with accuracy improvement. In: Transactions on Information Forensics and Security (TIFS), pp. 2706–2716

  30. Stefanov, E., Papamanthou, C., Shi, E.: Practical dynamic searchable encryption with small leakage. In: Network and Distributed System Security (NDSS), pp. 23–26. ISOC (2014)

  31. Garg, S., Mohassel, P., Papamanthou, C.: TWORAM: round-optimal oblivious RAM with applications to searchable encryption. IACR Cryptology ePrint Archive (2015)

  32. Bost, R., Fouque, P.A., Pointcheval, D.: Verifiable dynamic symmetric searchable encryption: optimality and forward security. IACR Cryptology ePrint Archive (2016)

  33. Chang, Z., Xie, D., Li, F.: Oblivious RAM: a dissection and experimental evaluation. In: International Conference on Very Large Data Bases (VLDB), pp. 1113–1124 (2016)

  34. Islam, M.S., Kuzu, M., Kantarcioglu, M.: Access pattern disclosure on searchable encryption: ramification, attack and mitigation. In: Network and Distributed System Security (NDSS). ISOC (2012)

  35. Popa, R.A., Redfield, C., Zeldovich, N., et al.: CryptDB: protecting confidentiality with encrypted query processing. In: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (SOSP), pp. 85–100. ACM (2011)

  36. Mavroforakis, C., Chenette, N., O’Neill, A., et al.: Modular order-preserving encryption, revisited. In: ACM International Conference on Management of Data (SIGMOD), pp. 763–777. ACM (2015)

  37. Naveed, M., Kamara, S., Wright, C.V.: Inference attacks on property-preserving encrypted databases. In: ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 644–655. ACM (2015)

  38. Yao, A.C.: Protocols for secure computations. In: Foundations of Computer Science (SFCS), pp. 160–164 (1982)

  39. Ben-David, A., Nisan, N., Pinkas, B.: FairplayMP: a system for secure multi-party computation. In: Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS), pp. 257–266. ACM (2008)

  40. Dijk, M.V., Gentry, C., Halevi, S., et al.: Fully homomorphic encryption over the integers. In: Advances in Cryptology – EUROCRYPT, pp. 24–43. Springer, Berlin, Heidelberg (2010)

  41. Enron email dataset (2015). http://www.cs.cmu.edu/~enron/

  42. Cho, E., Myers, S.A., Leskovec, J.: Friendship and mobility: user movement in location-based social networks. In: Proceedings of the 17th International Conference on Knowledge Discovery and Data mining (SIGKDD), pp. 1082–1090. ACM (2011)


Author information

Corresponding author

Correspondence to Zhiqiang Wu.

Appendix: Proof of Theorem 5.1


Proof

We first study DDFPP. Given a partition instance, a candidate solution can be verified in polynomial time, so DDFPP is in NP. Next, we reduce a known NP-complete problem, the Subset Sum Problem, to DDFPP. The Subset Sum Problem is as follows: “Given a multiset of positive numbers \(A=\{a_1,a_2,\ldots ,a_{n}\}\) and a positive number t, is there a set B such that \(\sum _{a_i\in B}{a_i}=t\) and \(B\subseteq A\)?”

Consider an instance of the Subset Sum Problem with a multiset of positive numbers \(A=\{a_1,a_2,\ldots ,a_n\}\) and a positive number t. We convert it to an instance of DDFPP using the following construction. If n is odd, we first insert a zero into A. Let \(b=\sum _{a_i\in A}{a_i}\). We create two new numbers \(a_{n+1}=2b-t\) and \(a_{n+2}=b+t\), and form a new set \(C=A\cup \{a_{n+1},a_{n+2}\}\). For each number \(a_i\) in C, we generate a file group \(G_{a_i}=\{d_1, d_2, \cdots , d_{a_i}\}\). The generated files are \(G_{a_1}\bigcup G_{a_2}\bigcup \cdots \bigcup G_{a_{n+2}}\), which contain \(\sum _{a_i\in C}{a_i}=b+(2b-t)+(b+t)=4b\) files in total. For each file group \(G_{a_i}\), we first create k unique keywords and insert these k keywords into every file in the group. That is, the \(a_i\) files of group \(G_{a_i}\) share the same k keywords. Next, we insert other randomly generated keywords into every file. This process is repeated until all of the file groups are initialized.
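The construction above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper: the function name `build_ddfpp_instance` is hypothetical, keywords are modeled as strings, and, as a simplifying assumption, the "randomly generated" extra keywords are replaced by one fresh keyword per file so that files in different groups never share a keyword.

```python
from itertools import count


def build_ddfpp_instance(A, t, k):
    """Map a Subset Sum instance (A, t) to file groups.

    Returns (C, groups), where groups[i] is a list of keyword sets,
    one set per file of the group generated from C[i].
    """
    A = list(A)
    if not 0 < t <= sum(A):
        raise ValueError("expected 0 < t <= sum(A)")
    if len(A) % 2 == 1:
        A.append(0)  # pad so that n is even; this yields an empty group
    b = sum(A)
    C = A + [2 * b - t, b + t]  # a_{n+1} = 2b - t, a_{n+2} = b + t

    fresh = count()  # assumed source of keyword ids that never repeat
    groups = []
    for a in C:
        # k keywords shared by every file of this group
        shared = {f"w{next(fresh)}" for _ in range(k)}
        # each file also gets one keyword of its own (assumption, see above)
        files = [shared | {f"w{next(fresh)}"} for _ in range(a)]
        groups.append(files)
    return C, groups


C, groups = build_ddfpp_instance([3, 1, 4, 2], t=5, k=2)
print(C, sum(len(g) for g in groups))
```

For \(A=\{3,1,4,2\}\) and \(t=5\) we have \(b=10\), so the two added numbers are both 15 and the instance contains \(4b=40\) files.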

Suppose the file groups constructed above have a data file partition solution, that is, they can be partitioned in polynomial time into \({\mathcal {F}}_1\) and \({\mathcal {F}}_2\) such that \(\vert {\mathcal {F}}_1\vert =\vert {\mathcal {F}}_2\vert \) and \(\vert W({\mathcal {F}}_1)\bigcap W({\mathcal {F}}_2)\vert <k\). We now prove that A has a subset sum solution. Note that the files within each file group share k common keywords. This implies that, for any file group \(G_{a_i}\) constructed from the number \(a_i\), the files of \(G_{a_i}\) are either all in \({\mathcal {F}}_1\) or all in \({\mathcal {F}}_2\). Otherwise, there would exist two files \(d_i\) and \(d_j\) of the same group with \(d_i\in {\mathcal {F}}_1\) and \(d_j\in {\mathcal {F}}_2\), so that \(\vert d_i\bigcap d_j\vert \geqslant k\) and hence \(\vert W({\mathcal {F}}_1)\bigcap W({\mathcal {F}}_2)\vert \geqslant k\), which contradicts our assumption. Since \(\vert {\mathcal {F}}_1\vert =\vert {\mathcal {F}}_2\vert \), we have \(\vert G_{x_1}\vert +\vert G_{x_2}\vert +\cdots +\vert G_{x_r}\vert =\vert G_{y_1}\vert +\vert G_{y_2}\vert +\cdots +\vert G_{y_r}\vert \), where \(G_{x_i}\subseteq {\mathcal {F}}_1\), \(G_{y_i}\subseteq {\mathcal {F}}_2\), and \(2r=n+2\). As \(\vert G_{x_i}\vert =x_i\) and \(\vert G_{y_i}\vert =y_i\), the following equation holds: \(x_1+x_2+\cdots +x_r=y_1+y_2+\cdots +y_r\). Let \(B=\{x_1,x_2,\ldots ,x_r\}\) and \(B'=\{y_1,y_2,\ldots ,y_r\}\). Note that \(a_{n+1}\) and \(a_{n+2}\) cannot coexist in the set B or in the set \(B'\); otherwise, \(\sum {x_i}\ne \sum {y_i}\). If \(a_{n+1}\) is in B, then the subset sum is \(\sum _{a_i\in B-\{a_{n+1}\}}a_i=(2b-(2b-t))=t\). If \(a_{n+1}\) is in \(B'\), then \(\sum _{a_i\in B'-\{a_{n+1}\}}a_i=(2b-(2b-t))=t\). In either case, \(B-\{a_{n+1}\}\) or \(B'-\{a_{n+1}\}\) is a subset sum solution. Hence the Subset Sum Problem \(\le _p\) DDFPP, which means that DDFPP is NP-complete. Since DDFPP is a special case of DFPP, DFPP is NP-hard. \(\square \)
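The arithmetic at the core of this argument can be sanity-checked by brute force on small inputs. The sketch below is an illustration only, with hypothetical function names. It works purely at the level of the numbers: it ignores keywords and the group-count condition \(2r=n+2\), and checks that A has a subset summing to t exactly when \(C=A\cup \{2b-t,\,b+t\}\) can be split into two parts that each sum to 2b.

```python
from itertools import combinations


def has_subset_sum(A, t):
    """Exhaustively test whether some sub-multiset of A sums to t."""
    return any(sum(c) == t
               for r in range(len(A) + 1)
               for c in combinations(A, r))


def reduce_numbers(A, t):
    """Number-level part of the reduction: append 2b - t and b + t."""
    b = sum(A)
    return list(A) + [2 * b - t, b + t]


def has_equal_split(C):
    """Test whether C splits into two parts of equal sum."""
    total = sum(C)
    return total % 2 == 0 and has_subset_sum(C, total // 2)


A = [3, 1, 4, 2]
for t in range(1, sum(A) + 1):
    # The two answers agree for every target t with 0 < t <= b.
    assert has_subset_sum(A, t) == has_equal_split(reduce_numbers(A, t))
```

The check relies on the same observation as the proof: the two added numbers sum to 3b, which exceeds 2b, so they must fall in different parts, and the part containing \(2b-t\) needs exactly t more from A.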

About this article

Cite this article

Wu, Z., Li, K. VBTree: forward secure conjunctive queries over encrypted data for cloud computing. The VLDB Journal 28, 25–46 (2019). https://doi.org/10.1007/s00778-018-0517-6

  • DOI: https://doi.org/10.1007/s00778-018-0517-6
