An encrypted deduplication scheme based on files diversity

The Journal of Supercomputing

Abstract

Deduplication is widely used to improve the efficiency of cloud storage. In personal cloud storage, files are diverse: they differ in size and in popularity, and some have many copies. Existing deduplication schemes focus mainly on data security and do little to improve overall performance, in particular to keep computing overhead low for diverse data. Identifying data with deterministic tags based on convergent encryption may leak information about the data, while fully random tags generated by complex encryption algorithms incur higher computing overhead. To address these issues, we propose an encrypted deduplication scheme based on files diversity (FD-Dedup), in which diverse files are identified by semi-random tags. We also design a semi-random tag generation (SRTG) algorithm that balances computing overhead against security. Security analysis and performance comparison show that FD-Dedup can balance security and computing overhead for diverse files in personal cloud storage.
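
The leakage risk of deterministic tags mentioned above can be illustrated with a minimal sketch. It is not the SRTG algorithm, which this preview does not describe; it only assumes the textbook form of convergent encryption, where the key is a hash of the file and the tag is a hash of that key. The function names and the sample data are hypothetical.

```python
import hashlib


def convergent_key(data: bytes) -> bytes:
    """Derive the key from the content itself (assumed textbook convergent encryption)."""
    return hashlib.sha256(data).digest()


def deterministic_tag(data: bytes) -> str:
    """Tag is a hash of the content-derived key, so identical files share one tag."""
    return hashlib.sha256(convergent_key(data)).hexdigest()


def confirm_guesses(stored_tags: set, candidates: list) -> list:
    """Return every candidate whose tag is already stored.

    Anyone who can compute tags and see which ones exist can test
    guesses about low-entropy files without knowing any secret.
    """
    return [c for c in candidates if deterministic_tag(c) in stored_tags]


# Hypothetical example: the server holds the tag of one predictable file.
stored = {deterministic_tag(b"salary: 5000")}
guesses = [b"salary: 4000", b"salary: 5000", b"salary: 6000"]
confirmed = confirm_guesses(stored, guesses)
```

Because the tag depends only on the content, duplicate detection is a cheap set lookup, but the same property lets an observer confirm a guessed file. Fully random tags remove that property at the cost of a more expensive equality test, which is the trade-off the abstract describes.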

Availability of data and materials

Not applicable.

Funding

Not applicable.

Author information

Contributions

Mr. He and Mr. Zhu wrote the main manuscript text. Mr. Zhu conducted the experiments and prepared the figures and tables. All authors reviewed the manuscript.

Corresponding author

Correspondence to Yifan Zhu.

Ethics declarations

Conflict of interest

Not applicable.

Ethics approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

He, X., Zhu, Y. An encrypted deduplication scheme based on files diversity. J Supercomput 80, 22860–22884 (2024). https://doi.org/10.1007/s11227-024-06325-0

  • DOI: https://doi.org/10.1007/s11227-024-06325-0
