DOI: 10.1145/2996429.2996432
research-article

On Information Leakage in Deduplicated Storage Systems

Published: 28 October 2016

ABSTRACT

Most existing cloud storage providers rely on data deduplication to significantly reduce storage costs by storing duplicate data only once. While the literature has thoroughly analyzed client-side information leakage associated with the use of data deduplication techniques in the cloud, no previous work has analyzed the information leakage associated with access trace information (e.g., object size and timing) that is available whenever a client uploads a file to a curious cloud provider.

In this paper, we address this problem and analyze information leakage associated with data deduplication on a curious storage server. We show that even if the data is encrypted using a key not known by the storage server, the latter can still acquire considerable information about the stored files and even determine which files are stored. We validate our results both analytically and experimentally using a number of real storage datasets.
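To make the idea of leakage through object sizes concrete, the following is a minimal illustrative sketch and not the analysis used in the paper. It assumes a toy content-defined chunker (the boundary rule, `window`, `mask`, and `min_size` are all hypothetical choices), assumes that encryption preserves each chunk's length, and assumes the server holds a set of candidate plaintext files. Under those assumptions, the sequence of chunk sizes observed during an upload can be compared against the sequences the candidates would produce. The helper names `chunk_sizes`, `identify`, and `toy_file` are invented for this example.

```python
import hashlib
from typing import Dict, List


def chunk_sizes(data: bytes, window: int = 4, mask: int = 0x0F,
                min_size: int = 8) -> List[int]:
    """Toy content-defined chunking (hypothetical parameters).

    A chunk ends after byte i when a hash of the last `window` bytes has its
    low bits (selected by `mask`) equal to zero. Only the sizes are returned,
    since sizes are all a server sees when chunks are encrypted.
    """
    sizes, start = [], 0
    for i in range(len(data)):
        # Enforce a minimum chunk length that also covers the hash window.
        if i + 1 - start < max(min_size, window):
            continue
        digest = hashlib.sha256(data[i + 1 - window:i + 1]).digest()
        if digest[0] & mask == 0:
            sizes.append(i + 1 - start)
            start = i + 1
    if start < len(data):
        sizes.append(len(data) - start)  # trailing partial chunk
    return sizes


def identify(observed: List[int], candidates: Dict[str, bytes]) -> List[str]:
    """Return the candidate files whose chunk-size sequence matches the trace."""
    return [name for name, content in candidates.items()
            if chunk_sizes(content) == observed]


def toy_file(label: str, length: int) -> bytes:
    """Deterministic filler content standing in for a real file (hypothetical)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(f"{label}:{counter}".encode()).digest()
        counter += 1
    return out[:length]


candidates = {"report_a": toy_file("a", 2048), "report_b": toy_file("b", 2048)}
# The server never sees plaintext, only the sizes of the uploaded chunks.
observed = chunk_sizes(candidates["report_a"])
matches = identify(observed, candidates)
print(matches)
```

In this sketch the two candidate files have the same total length, so any distinction between them comes from the chunk-size sequence alone rather than from the overall file size.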


          • Published in

            CCSW '16: Proceedings of the 2016 ACM on Cloud Computing Security Workshop
            October 2016
            116 pages
            ISBN:9781450345729
            DOI:10.1145/2996429

            Copyright © 2016 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            CCSW '16 Paper Acceptance Rate: 8 of 23 submissions, 35%. Overall Acceptance Rate: 37 of 108 submissions, 34%.
