DOI:10.1145/2664168.2664169
research-article

Distributed Key Generation for Encrypted Deduplication: Achieving the Strongest Privacy

Published: 07 November 2014

Abstract

Large-scale cloud storage systems often attempt to achieve two seemingly conflicting goals: (1) the systems need to reduce the copies of redundant data to save space, a process called deduplication; and (2) users demand encryption of their data to ensure privacy. Conventional encryption makes deduplication on ciphertexts ineffective, as it destroys data redundancy. A line of work, originating from Convergent Encryption [27] and evolving into Message-Locked Encryption [13] and the latest DupLESS architecture [12], strives to solve this problem. DupLESS relies on a key server to help the clients generate encryption keys that result in convergent ciphertexts. In this paper, we first introduce a new security notion appropriate for the setting of deduplication and show that it is strictly stronger than all relevant notions. We then provide a rigorous proof of security against this notion, in the random oracle model, for the DupLESS architecture; such a proof is lacking in the original paper. Our proof shows that using an additional secret, other than the data itself, for generating encryption keys achieves the best possible security under the current deduplication paradigm. We also introduce a distributed protocol that eliminates the need for the key server. This not only provides better protection but also allows less managed systems such as P2P systems to enjoy the high security level. Implementation and evaluation show that the scheme is both robust and practical.
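The contrast the abstract draws between keys derived from the data alone and keys that also depend on an additional secret can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not the paper's construction: `convergent_key`, `server_aided_key`, `toy_deterministic_encrypt`, and `brute_force` are hypothetical names, HMAC-SHA-256 stands in for whatever keyed function the key server (or the distributed protocol) evaluates, and the XOR keystream cipher is a toy that must not be used for real data.

```python
import hashlib
import hmac


def convergent_key(message: bytes) -> bytes:
    """Key derived from the message alone (convergent-encryption style)."""
    return hashlib.sha256(message).digest()


def server_aided_key(secret: bytes, message: bytes) -> bytes:
    """Key that also depends on a secret the attacker does not hold.

    HMAC is only a stand-in for the keyed function; it is an assumption.
    """
    return hmac.new(secret, message, hashlib.sha256).digest()


def toy_deterministic_encrypt(key: bytes, message: bytes) -> bytes:
    """Toy cipher: XOR with a SHA-256 counter keystream (illustration only)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(message):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(m ^ s for m, s in zip(message, stream))


def brute_force(ciphertext: bytes, candidates, derive_key):
    """Offline guessing: re-encrypt each candidate and compare ciphertexts."""
    for guess in candidates:
        if toy_deterministic_encrypt(derive_key(guess), guess) == ciphertext:
            return guess
    return None


doc = b"salary: 52000"
candidates = [b"salary: %d" % n for n in range(50000, 55000, 1000)]

# Two clients holding the same file produce the same ciphertext, so the
# storage provider can deduplicate without seeing the plaintext.
ct_plain = toy_deterministic_encrypt(convergent_key(doc), doc)
same_ciphertext = ct_plain == toy_deterministic_encrypt(convergent_key(doc), doc)

# With message-only keys, anyone can test guesses against the ciphertext.
recovered = brute_force(ct_plain, candidates, convergent_key)

# With an additional secret, an attacker who lacks it cannot derive the key.
secret = b"hypothetical-key-server-secret"
ct_aided = toy_deterministic_encrypt(server_aided_key(secret, doc), doc)
attacker_result = brute_force(
    ct_aided, candidates, lambda m: server_aided_key(b"attacker-guess", m)
)
```

In this sketch deduplication still works in both settings, because every client that obtains the same key for the same file produces the same ciphertext. The difference is that the offline guessing attack on predictable files only succeeds when the key depends on the file alone.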

References

[1]
Bitcasa. http://www.bitcasa.com/.
[2]
Ciphertite. http://www.ciphertite.com.
[3]
Dropbox. http://www.dropbox.com/.
[4]
flud. http://flud.org.
[5]
Freenet. https://freenetproject.org/.
[6]
GNUnet. http://gnunet.org.
[7]
P. Anderson and L. Zhang. Fast and secure laptop backups with encrypted de-duplication. In Proceedings of the 24th international conference on Large installation system administration, LISA'10, pages 1--8, Berkeley, CA, USA, 2010. USENIX Association.
[8]
B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In PODS '07, pages 273--282, New York, NY, USA, 2007. ACM Press.
[9]
M. Bellare, A. Boldyreva, and A. O'Neill. Deterministic and efficiently searchable encryption. In Proceedings of the 27th annual international cryptology conference on Advances in cryptology, CRYPTO'07, pages 535--552, Berlin, Heidelberg, 2007. Springer-Verlag. Full version of this paper at http://www.cc.gatech.edu/~aboldyre/papers/bbo.pdf.
[10]
M. Bellare, A. Desai, E. Jokipii, and P. Rogaway. A concrete security treatment of symmetric encryption. In Proceedings of the 38th Annual Symposium on Foundations of Computer Science, FOCS '97, pages 394--403, Washington, DC, USA, 1997. IEEE Computer Society.
[11]
M. Bellare, M. Fischlin, A. O'Neill, and T. Ristenpart. Deterministic encryption: Definitional equivalences and constructions without random oracles. In Proceedings of the 28th Annual conference on Cryptology: Advances in Cryptology, CRYPTO 2008, pages 360--378, Berlin, Heidelberg, 2008. Springer-Verlag.
[12]
M. Bellare and S. Keelveedhi. DupLESS: Server-aided encryption for deduplicated storage. In USENIX Security Symposium 2013, 2013.
[13]
M. Bellare, S. Keelveedhi, and T. Ristenpart. Message-locked encryption and secure deduplication. In T. Johansson and P. Nguyen, editors, Advances in Cryptology – EUROCRYPT 2013, volume 7881 of Lecture Notes in Computer Science, pages 296--312. Springer Berlin Heidelberg, 2013.
[14]
K. Bennett, C. Grothoff, T. Horozov, and I. Patrascu. Efficient sharing of encrypted data. In Proceedings of the 7th Australian Conference on Information Security and Privacy, ACISP '02, pages 107--120, London, UK, 2002. Springer-Verlag.
[15]
A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: the SuLQ framework. In PODS '05, pages 128--138, New York, NY, USA, 2005. ACM Press.
[16]
A. Boldyreva, S. Fehr, and A. O'Neill. On notions of security for deterministic encryption, and efficient constructions without random oracles. In Proceedings of the 28th Annual conference on Cryptology: Advances in Cryptology, CRYPTO 2008, pages 335--359, Berlin, Heidelberg, 2008. Springer-Verlag.
[17]
W. J. Bolosky, J. R. Douceur, D. Ely, and M. Theimer. Feasibility of a serverless distributed file system deployed on an existing set of desktop PCs. In Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, SIGMETRICS '00, pages 34--43, New York, NY, USA, 2000. ACM.
[18]
D. Boneh and M. Franklin. Efficient generation of shared RSA keys. J. ACM, 48(4):702--722, July 2001.
[19]
D. Chaum. Blind signatures for untraceable payments. In D. Chaum, R. Rivest, and A. Sherman, editors, Advances in Cryptology, pages 199--203. Springer US, 1983.
[20]
F. Chin and G. Ozsoyoglu. Auditing for secure statistical databases. In ACM 81: Proceedings of the ACM '81 conference, pages 53--59, New York, NY, USA, 1981. ACM.
[21]
A. T. Clements, I. Ahmad, M. Vilayannur, and J. Li. Decentralized deduplication in SAN cluster file systems. In Proceedings of the 2009 conference on USENIX Annual technical conference, USENIX'09, pages 8--8, Berkeley, CA, USA, 2009. USENIX Association.
[22]
R. Cramer and V. Shoup. A practical public key cryptosystem provably secure against adaptive chosen ciphertext attack. In H. Krawczyk, editor, Advances in Cryptology – CRYPTO '98, volume 1462 of Lecture Notes in Computer Science, pages 13--25. Springer Berlin Heidelberg, 1998.
[23]
T. Dalenius. Towards a methodology for statistical disclosure control. Statistik Tidskrift, 15:429--444, 1977.
[24]
I. Damgård and M. Koprowski. Practical threshold RSA signatures without a trusted dealer. In Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques: Advances in Cryptology, EUROCRYPT '01, pages 152--165, London, UK, 2001. Springer-Verlag.
[25]
I. Damgård and G. L. Mikkelsen. Efficient, robust and constant-round distributed RSA key generation. In Proceedings of the 7th international conference on Theory of Cryptography, TCC'10, pages 183--200, Berlin, Heidelberg, 2010. Springer-Verlag.
[26]
I. Dinur and K. Nissim. Revealing information while preserving privacy. In PODS '03, pages 202--210, New York, NY, USA, 2003. ACM Press.
[27]
J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer. Reclaiming space from duplicate files in a serverless distributed file system. In Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS'02), ICDCS '02, pages 617--624, Washington, DC, USA, 2002. IEEE Computer Society.
[28]
Y. Duan. Privacy without noise. In CIKM '09, New York, NY, USA, 2009. ACM.
[29]
Y. Duan and J. Canny. Protecting user data in ubiquitous computing: Towards trustworthy environments. In PET'04, 2004.
[30]
Y. Duan and J. Canny. How to construct multicast cryptosystems provably secure against adaptive chosen ciphertext attack. In RSA Conference 2006, Cryptographers' Track. San Jose, USA, volume 3860 of Lecture Notes in Computer Science, pages 244--261. Springer-Verlag, 2006.
[31]
Y. Duan, J. Canny, and J. Zhan. P4P: Practical large-scale privacy-preserving distributed computation robust against malicious users. In USENIX Security Symposium 2010, pages 609--618, 2010.
[32]
M. Dutch. Understanding data deduplication ratios. http://www.snia.org, 2008.
[33]
C. Dwork. An ad omnia approach to defining and achieving private data analysis. In PinKDD, pages 1--13, 2007.
[34]
C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT 2006. Springer, 2006.
[35]
C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In TCC 2006. Springer, 2006.
[36]
EMC. http://www.emc.com/solutions/samples/backuprecovery-archiving/backup-data-deduplication.htm.
[37]
P.-A. Fouque and J. Stern. Fully distributed threshold RSA under standard assumptions. In Proceedings of the 7th International Conference on the Theory and Application of Cryptology and Information Security: Advances in Cryptology, ASIACRYPT '01, pages 310--330, London, UK, 2001. Springer-Verlag.
[38]
Y. Frankel, P. D. MacKenzie, and M. Yung. Robust efficient distributed RSA-key generation. In Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing, PODC '98, pages 320--, New York, NY, USA, 1998. ACM.
[39]
S. Goldwasser and S. Micali. Probabilistic encryption & how to play mental poker keeping secret all partial information. In Proceedings of the fourteenth annual ACM symposium on Theory of computing, STOC '82, pages 365--377, New York, NY, USA, 1982. ACM.
[40]
S. Goldwasser and S. Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2):270--299, 1984.
[41]
S. Goldwasser, S. Micali, and C. Rackoff. The knowledge complexity of interactive proof systems. SIAM J. Comput., 18(1):186--208, Feb. 1989.
[42]
S. Goldwasser, S. Micali, and R. L. Rivest. A digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal on Computing, 17(2):281--308, 1988.
[43]
S. K. Langford. Threshold DSS signatures without a trusted party. In Proceedings of the 15th Annual International Cryptology Conference on Advances in Cryptology, CRYPTO '95, pages 397--409, London, UK, 1995. Springer-Verlag.
[44]
F. McSherry and I. Mironov. Differentially private recommender systems: Building privacy into the Netflix Prize contenders. In KDD '09, pages 627--636, New York, NY, USA, 2009. ACM.
[45]
E. L. Miller, D. D. E. Long, W. E. Freeman, and B. Reed. Strong security for network-attached storage. In Proceedings of the Conference on File and Storage Technologies, FAST '02, pages 1--13, Berkeley, CA, USA, 2002. USENIX Association.
[46]
K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. In STOC '07, pages 75--84. ACM, 2007.
[47]
D. H. Phan and D. Pointcheval. Deterministic Symmetric Encryption (Semantic Security and Pseudo-Random Permutations). In Proceedings of the 11th Annual Workshop on Selected Areas in Cryptography (SAC '04), volume 3357 of Lecture Notes in Computer Science, pages 185--200, Waterloo, Canada, 2004. Springer.
[48]
P. Rogaway, M. Bellare, and J. Black. OCB: A block-cipher mode of operation for efficient authenticated encryption. ACM Trans. Inf. Syst. Secur., 6(3):365--403, Aug. 2003.
[49]
V. Shoup. Practical threshold signatures. In Proceedings of the 19th international conference on Theory and application of cryptographic techniques, EUROCRYPT'00, pages 207--220, Berlin, Heidelberg, 2000. Springer-Verlag.
[50]
Z. Wilcox-O'Hearn. Convergent encryption reconsidered. https://tahoe-lafs.org/pipermail/tahoe-dev/2008-March/000449.html, 2008.
[51]
Z. Wilcox-O'Hearn and B. Warner. Tahoe: the least-authority filesystem. In Proceedings of the 4th ACM international workshop on Storage security and survivability, StorageSS '08, pages 21--26, New York, NY, USA, 2008. ACM.
[52]
Y. Xing, Z. Li, and Y. Dai. Peerdedupe: Insights into the peer-assisted sampling deduplication. In Peer-to-Peer Computing, pages 1--10. IEEE, 2010.
[53]
X. Zhao, Y. Zhang, Y. Wu, K. Chen, J. Jiang, and K. Li. Liquid: A scalable deduplication file system for virtual machine images. IEEE Transactions on Parallel and Distributed Systems, 99(PrePrints):1, 2013.

Published In

CCSW '14: Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security
November 2014
160 pages
ISBN:9781450332392
DOI:10.1145/2664168
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cloud computing security
  2. deduplication
  3. deterministic encryption

Qualifiers

  • Research-article

Conference

CCS'14

Acceptance Rates

CCSW '14 Paper Acceptance Rate 12 of 36 submissions, 33%;
Overall Acceptance Rate 37 of 108 submissions, 34%

Cited By

  • (2025) PASCOINFOG/PASFOG: Privacy-Preserving Data Deduplication Algorithms for Fog Storage Systems. IEEE Consumer Electronics Magazine 14:1 (37-45). DOI:10.1109/MCE.2023.3333559. Online publication date: Jan-2025
  • (2024) Encrypted Data Reduction: Removing Redundancy from Encrypted Data in Outsourced Storage. ACM Transactions on Storage 20:4 (1-30). DOI:10.1145/3685278. Online publication date: 29-Jul-2024
  • (2024) Blockchain-Assisted Secure Deduplication for Large-Scale Cloud Storage Service. IEEE Transactions on Services Computing 17:3 (821-835). DOI:10.1109/TSC.2024.3350086. Online publication date: May-2024
  • (2024) SimLESS: A Secure Deduplication System Over Similar Data in Cloud Media Sharing. IEEE Transactions on Information Forensics and Security 19 (4700-4715). DOI:10.1109/TIFS.2024.3382603. Online publication date: 2024
  • (2024) LSDedup: Layered Secure Deduplication for Cloud Storage. IEEE Transactions on Computers 73:2 (422-435). DOI:10.1109/TC.2023.3331953. Online publication date: Feb-2024
  • (2024) DEFD: Dual-Entity Fuzzy Deduplication for Untrusted Environments. 2024 21st Annual International Conference on Privacy, Security and Trust (PST) (1-11). DOI:10.1109/PST62714.2024.10788052. Online publication date: 28-Aug-2024
  • (2024) Data Splitting Based Double Layer Encryption for Secure Ciphertext Deduplication in Cloud Storage. 2024 IEEE 17th International Conference on Cloud Computing (CLOUD) (153-163). DOI:10.1109/CLOUD62652.2024.00027. Online publication date: 7-Jul-2024
  • (2024) A Secure and Lightweight Cloud Data Deduplication Scheme with Efficient Access Control and Key Management. Computer Communications. DOI:10.1016/j.comcom.2024.05.003. Online publication date: May-2024
  • (2024) Device-Enhanced Secure Cloud Storage with Keyword Searchable Encryption and Deduplication. Computer Security – ESORICS 2024 (396-413). DOI:10.1007/978-3-031-70903-6_20. Online publication date: 5-Sep-2024
  • (2024) Convergent encryption enabled secure data deduplication algorithm for cloud environment. Concurrency and Computation: Practice and Experience 36:21. DOI:10.1002/cpe.8205. Online publication date: 21-Jun-2024