
Enabling Secure Deduplication in Encrypted Decentralized Storage

  • Conference paper
Network and System Security (NSS 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13787))


Abstract

With the rapid development of blockchain technology, decentralized cloud storage services are emerging and have become a new storage option. They aim to leverage unused storage resources across the network to build a more economical and reliable distributed storage network, and thus to eliminate the need to trust centralized storage providers by relying on mature blockchain consensus mechanisms. However, current solutions either lack protection for user data privacy or apply conventional encryption methods that cannot support cross-user deduplication over encrypted data. These limitations make it hard for them to balance optimized storage space utilization against encrypted data protection, especially when a user’s files are geographically distributed across nodes around the world. In this paper, we propose a secure deduplication system in the context of encrypted decentralized cloud storage. It uses a smart contract to incorporate message-locked encryption (MLE), the most prominent cryptographic primitive in secure data deduplication. With a carefully tailored design, our proposed scheme can be deployed seamlessly and transparently to a public blockchain. Together, our design enables secure data deduplication over decentralized storage while providing stringent cryptographic data privacy guarantees. In particular, our design naturally prevents potential malicious attacks such as file ownership cheating and file ciphertext poisoning. We implement a prototype of our system and deploy it to Ethereum. Comprehensive performance evaluations on real datasets demonstrate the effectiveness and efficiency of our design.
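To make the idea of message-locked encryption concrete, the following is a minimal sketch of convergent encryption, the simplest MLE instantiation, together with a tag-indexed store that detects duplicates. It is not the scheme proposed in the paper. The function names, the `DedupIndex` class, and the toy SHA-256 counter-mode keystream are all assumptions made for illustration; a real system would use a vetted cipher such as AES and would keep the index in a smart contract rather than in a Python dictionary.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustrative only, not vetted."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def mle_keygen(message: bytes) -> bytes:
    """Derive the key from the message itself, so equal files yield equal keys."""
    return hashlib.sha256(b"key|" + message).digest()


def mle_encrypt(key: bytes, message: bytes) -> bytes:
    """Deterministic encryption: XOR the message with the keystream."""
    return bytes(m ^ s for m, s in zip(message, _keystream(key, len(message))))


def mle_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR is its own inverse, so decryption reuses the same keystream."""
    return bytes(c ^ s for c, s in zip(ciphertext, _keystream(key, len(ciphertext))))


def mle_tag(ciphertext: bytes) -> str:
    """Tag computed over the ciphertext; used as the deduplication index key."""
    return hashlib.sha256(b"tag|" + ciphertext).hexdigest()


class DedupIndex:
    """Hypothetical in-memory stand-in for an on-chain tag index."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def upload(self, tag: str, ciphertext: bytes) -> bool:
        """Store the ciphertext if new; return False if it is a duplicate."""
        # Reject a tag that does not match the ciphertext it claims to describe.
        if mle_tag(ciphertext) != tag:
            raise ValueError("tag does not match ciphertext")
        if tag in self._store:
            return False
        self._store[tag] = ciphertext
        return True
```

Because the key depends only on the file contents, two users who encrypt the same file independently obtain the same ciphertext and tag, so the second upload is recognized as a duplicate without the store ever seeing the plaintext. The tag check in `upload` is only a simple consistency test and does not by itself cover the ownership and poisoning attacks named in the abstract.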


Notes

  1. For ease of presentation, we use fid to denote the address of the corresponding file on DCS, e.g., the content identifier (CID) for IPFS.

  2. In Storj, with erasure-code parameters \(k=29\) and \(n=80\), the total storage is roughly \(80/29 \approx 2.76\) times the original file size; similarly, Sia stores 3 times the original size.
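The expansion figure in note 2 follows from the ratio \(n/k\) of a \((k, n)\) erasure code, in which a file is split into \(k\) data pieces and encoded into \(n\) stored pieces. A minimal sketch of that arithmetic is shown below; the function names are hypothetical, and padding and metadata overhead are ignored.

```python
def expansion_factor(k: int, n: int) -> float:
    """Storage expansion of a (k, n) erasure code: n stored pieces per k data pieces."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return n / k


def stored_bytes(file_size: int, k: int, n: int) -> float:
    """Approximate bytes kept across the network for one file (ignores padding)."""
    return file_size * expansion_factor(k, n)


# Storj parameters quoted in note 2.
print(round(expansion_factor(29, 80), 2))
```

Under these assumptions, every duplicate that is detected and not stored again avoids roughly this multiple of the file size in network-wide storage.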


Acknowledgements

This work was supported in part by the National Key R&D Program of China (No. 2019YFB2102200), the National Science Fund for Distinguished Young Scholars (No. 61725205), the National Natural Science Foundation of China (Nos. 62002294, 62202379), and the Fundamental Research Funds for the Central Universities (Nos. 3102019QD1001, D5000220127).

Author information

Corresponding author

Correspondence to Helei Cui.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, B., Cui, H., Chen, Y., Liu, X., Yu, Z., Guo, B. (2022). Enabling Secure Deduplication in Encrypted Decentralized Storage. In: Yuan, X., Bai, G., Alcaraz, C., Majumdar, S. (eds) Network and System Security. NSS 2022. Lecture Notes in Computer Science, vol 13787. Springer, Cham. https://doi.org/10.1007/978-3-031-23020-2_26


  • DOI: https://doi.org/10.1007/978-3-031-23020-2_26


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23019-6

  • Online ISBN: 978-3-031-23020-2

  • eBook Packages: Computer Science (R0)
