ABSTRACT
Large-scale cloud storage systems often attempt to achieve two seemingly conflicting goals: (1) the systems need to reduce the copies of redundant data to save space, a process called deduplication; and (2) users demand encryption of their data to ensure privacy. Conventional encryption makes deduplication on ciphertexts ineffective, as it destroys data redundancy. A line of work, originating from Convergent Encryption [27] and evolving into Message-Locked Encryption [13] and the latest DupLESS architecture [12], strives to solve this problem. DupLESS relies on a key server to help the clients generate encryption keys that result in convergent ciphertexts. In this paper, we first introduce a new security notion appropriate for the setting of deduplication and show that it is strictly stronger than all relevant notions. We then provide a rigorous proof of security with respect to this notion, in the random oracle model, for the DupLESS architecture; such a proof is lacking in the original paper. Our proof shows that using an additional secret, other than the data itself, for generating encryption keys achieves the best possible security under the current deduplication paradigm. We also introduce a distributed protocol that eliminates the need for the key server. This not only provides better protection but also allows less managed systems such as P2P systems to enjoy the same high level of security. Implementation and evaluation show that the scheme is both robust and practical.
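To make the two key-generation styles mentioned in the abstract concrete, the following is a minimal Python sketch, not the paper's construction. It assumes SHA-256 as the hash, a toy hash-based keystream in place of a real cipher, and HMAC under a hypothetical `server_secret` as a stand-in for the key server's contribution. All function names are illustrative, and the sketch omits the oblivious (blinded) exchange that keeps the server from learning anything about the data.

```python
import hashlib
import hmac


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy deterministic cipher (illustration only, not secure for real use):
    XOR the data with SHA-256(key || block counter)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def convergent_key(message: bytes) -> bytes:
    """Convergent encryption: the key depends only on the data itself."""
    return hashlib.sha256(message).digest()


def server_aided_key(message: bytes, server_secret: bytes) -> bytes:
    """Server-aided variant (assumption: HMAC stands in for the key server).
    The key also depends on a secret that is not derivable from the data."""
    digest = hashlib.sha256(message).digest()
    return hmac.new(server_secret, digest, hashlib.sha256).digest()


def encrypt(key: bytes, message: bytes) -> bytes:
    """Deterministic encryption: equal (key, message) pairs give equal ciphertexts."""
    return keystream_xor(key, message)


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same keystream inverts encryption."""
    return keystream_xor(key, ciphertext)


if __name__ == "__main__":
    data = b"identical file uploaded by two users"
    secret = b"hypothetical key-server secret"

    # Two users holding the same data derive the same key and ciphertext,
    # so the storage provider can deduplicate the ciphertexts.
    c_alice = encrypt(server_aided_key(data, secret), data)
    c_bob = encrypt(server_aided_key(data, secret), data)
    print(c_alice == c_bob)
```

With plain convergent keys, anyone who can guess a candidate file can recompute its key and ciphertext and compare. Mixing in a secret that is independent of the data removes that offline test for parties who do not hold the secret, which is the intuition behind the additional secret discussed in the abstract.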
References
- Bitcasa. http://www.bitcasa.com/.
- Ciphertite. http://www.ciphertite.com.
- Dropbox. http://www.dropbox.com/.
- flud. http://flud.org.
- Freenet. https://freenetproject.org/.
- GNUnet. http://gnunet.org.
- P. Anderson and L. Zhang. Fast and secure laptop backups with encrypted de-duplication. In Proceedings of the 24th international conference on Large installation system administration, LISA'10, pages 1--8, Berkeley, CA, USA, 2010. USENIX Association.
- B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In PODS '07, pages 273--282, New York, NY, USA, 2007. ACM Press.
- M. Bellare, A. Boldyreva, and A. O'Neill. Deterministic and efficiently searchable encryption. In Proceedings of the 27th annual international cryptology conference on Advances in cryptology, CRYPTO'07, pages 535--552, Berlin, Heidelberg, 2007. Springer-Verlag. Full version of this paper at http://www.cc.gatech.edu/~aboldyre/papers/bbo.pdf.
- M. Bellare, A. Desai, E. Jokipii, and P. Rogaway. A concrete security treatment of symmetric encryption. In Proceedings of the 38th Annual Symposium on Foundations of Computer Science, FOCS '97, pages 394--403, Washington, DC, USA, 1997. IEEE Computer Society.
- M. Bellare, M. Fischlin, A. O'Neill, and T. Ristenpart. Deterministic encryption: Definitional equivalences and constructions without random oracles. In Proceedings of the 28th Annual conference on Cryptology: Advances in Cryptology, CRYPTO 2008, pages 360--378, Berlin, Heidelberg, 2008. Springer-Verlag.
- M. Bellare and S. Keelveedhi. DupLESS: Server-aided encryption for deduplicated storage. In USENIX Security Symposium 2013, 2013.
- M. Bellare, S. Keelveedhi, and T. Ristenpart. Message-locked encryption and secure deduplication. In T. Johansson and P. Nguyen, editors, Advances in Cryptology - EUROCRYPT 2013, volume 7881 of Lecture Notes in Computer Science, pages 296--312. Springer Berlin Heidelberg, 2013.
- K. Bennett, C. Grothoff, T. Horozov, and I. Patrascu. Efficient sharing of encrypted data. In Proceedings of the 7th Australian Conference on Information Security and Privacy, ACISP '02, pages 107--120, London, UK, 2002. Springer-Verlag.
- A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: the SuLQ framework. In PODS '05, pages 128--138, New York, NY, USA, 2005. ACM Press.
- A. Boldyreva, S. Fehr, and A. O'Neill. On notions of security for deterministic encryption, and efficient constructions without random oracles. In Proceedings of the 28th Annual conference on Cryptology: Advances in Cryptology, CRYPTO 2008, pages 335--359, Berlin, Heidelberg, 2008. Springer-Verlag.
- W. J. Bolosky, J. R. Douceur, D. Ely, and M. Theimer. Feasibility of a serverless distributed file system deployed on an existing set of desktop PCs. In Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, SIGMETRICS '00, pages 34--43, New York, NY, USA, 2000. ACM.
- D. Boneh and M. Franklin. Efficient generation of shared RSA keys. J. ACM, 48(4):702--722, July 2001.
- D. Chaum. Blind signatures for untraceable payments. In D. Chaum, R. Rivest, and A. Sherman, editors, Advances in Cryptology, pages 199--203. Springer US, 1983.
- F. Chin and G. Ozsoyoglu. Auditing for secure statistical databases. In ACM 81: Proceedings of the ACM '81 conference, pages 53--59, New York, NY, USA, 1981. ACM.
- A. T. Clements, I. Ahmad, M. Vilayannur, and J. Li. Decentralized deduplication in SAN cluster file systems. In Proceedings of the 2009 conference on USENIX Annual technical conference, USENIX'09, pages 8--8, Berkeley, CA, USA, 2009. USENIX Association.
- R. Cramer and V. Shoup. A practical public key cryptosystem provably secure against adaptive chosen ciphertext attack. In H. Krawczyk, editor, Advances in Cryptology - CRYPTO '98, volume 1462 of Lecture Notes in Computer Science, pages 13--25. Springer Berlin Heidelberg, 1998.
- T. Dalenius. Towards a methodology for statistical disclosure control. Statistik Tidskrift, 15:429--444, 1977.
- I. Damgård and M. Koprowski. Practical threshold RSA signatures without a trusted dealer. In Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques: Advances in Cryptology, EUROCRYPT '01, pages 152--165, London, UK, 2001. Springer-Verlag.
- I. Damgård and G. L. Mikkelsen. Efficient, robust and constant-round distributed RSA key generation. In Proceedings of the 7th international conference on Theory of Cryptography, TCC'10, pages 183--200, Berlin, Heidelberg, 2010. Springer-Verlag.
- I. Dinur and K. Nissim. Revealing information while preserving privacy. In PODS '03, pages 202--210, New York, NY, USA, 2003. ACM Press.
- J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer. Reclaiming space from duplicate files in a serverless distributed file system. In Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS'02), ICDCS '02, pages 617--624, Washington, DC, USA, 2002. IEEE Computer Society.
- Y. Duan. Privacy without noise. In CIKM '09, New York, NY, USA, 2009. ACM.
- Y. Duan and J. Canny. Protecting user data in ubiquitous computing: Towards trustworthy environments. In PET'04, 2004.
- Y. Duan and J. Canny. How to construct multicast cryptosystems provably secure against adaptive chosen ciphertext attack. In RSA Conference 2006, Cryptographers' Track. San Jose, USA, volume 3860 of Lecture Notes in Computer Science, pages 244--261. Springer-Verlag, 2006.
- Y. Duan, J. Canny, and J. Zhan. P4P: Practical large-scale privacy-preserving distributed computation robust against malicious users. In USENIX Security Symposium 2010, pages 609--618, 2010.
- M. Dutch. Understanding data deduplication ratios. http://www.snia.org, 2008.
- C. Dwork. An ad omnia approach to defining and achieving private data analysis. In PinKDD, pages 1--13, 2007.
- C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT 2006. Springer, 2006.
- C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In TCC 2006. Springer, 2006.
- EMC. http://www.emc.com/solutions/samples/backuprecovery-archiving/backup-data-deduplication.htm.
- P.-A. Fouque and J. Stern. Fully distributed threshold RSA under standard assumptions. In Proceedings of the 7th International Conference on the Theory and Application of Cryptology and Information Security: Advances in Cryptology, ASIACRYPT '01, pages 310--330, London, UK, 2001. Springer-Verlag.
- Y. Frankel, P. D. MacKenzie, and M. Yung. Robust efficient distributed RSA-key generation. In Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing, PODC '98, pages 320--, New York, NY, USA, 1998. ACM.
- S. Goldwasser and S. Micali. Probabilistic encryption & how to play mental poker keeping secret all partial information. In Proceedings of the fourteenth annual ACM symposium on Theory of computing, STOC '82, pages 365--377, New York, NY, USA, 1982. ACM.
- S. Goldwasser and S. Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2):270--299, 1984.
- S. Goldwasser, S. Micali, and C. Rackoff. The knowledge complexity of interactive proof systems. SIAM J. Comput., 18(1):186--208, Feb. 1989.
- S. Goldwasser, S. Micali, and R. L. Rivest. A digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal on Computing, 17(2):281--308, 1988.
- S. K. Langford. Threshold DSS signatures without a trusted party. In Proceedings of the 15th Annual International Cryptology Conference on Advances in Cryptology, CRYPTO '95, pages 397--409, London, UK, 1995. Springer-Verlag.
- F. McSherry and I. Mironov. Differentially private recommender systems: Building privacy into the Netflix Prize contenders. In KDD '09, pages 627--636, New York, NY, USA, 2009. ACM.
- E. L. Miller, D. D. E. Long, W. E. Freeman, and B. Reed. Strong security for network-attached storage. In Proceedings of the Conference on File and Storage Technologies, FAST '02, pages 1--13, Berkeley, CA, USA, 2002. USENIX Association.
- K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. In STOC '07, pages 75--84. ACM, 2007.
- D. H. Phan and D. Pointcheval. Deterministic Symmetric Encryption (Semantic Security and Pseudo-Random Permutations). In Proceedings of the 11th Annual Workshop on Selected Areas in Cryptography (SAC '04), volume 3357 of Lecture Notes in Computer Science, pages 185--200, Waterloo, Canada, 2004. Springer.
- P. Rogaway, M. Bellare, and J. Black. OCB: A block-cipher mode of operation for efficient authenticated encryption. ACM Trans. Inf. Syst. Secur., 6(3):365--403, Aug. 2003.
- V. Shoup. Practical threshold signatures. In Proceedings of the 19th international conference on Theory and application of cryptographic techniques, EUROCRYPT'00, pages 207--220, Berlin, Heidelberg, 2000. Springer-Verlag.
- Z. Wilcox-O'Hearn. Convergent encryption reconsidered. https://tahoe-lafs.org/pipermail/tahoedev/2008-March/000449.html, 2008.
- Z. Wilcox-O'Hearn and B. Warner. Tahoe: the least-authority filesystem. In Proceedings of the 4th ACM international workshop on Storage security and survivability, StorageSS '08, pages 21--26, New York, NY, USA, 2008. ACM.
- Y. Xing, Z. Li, and Y. Dai. Peerdedupe: Insights into the peer-assisted sampling deduplication. In Peer-to-Peer Computing, pages 1--10. IEEE, 2010.
- X. Zhao, Y. Zhang, Y. Wu, K. Chen, J. Jiang, and K. Li. Liquid: A scalable deduplication file system for virtual machine images. IEEE Transactions on Parallel and Distributed Systems, 99(PrePrints):1, 2013.
Index Terms
- Distributed Key Generation for Encrypted Deduplication: Achieving the Strongest Privacy