Abstract
Deduplication is widely used to improve the efficiency of cloud storage. Files in personal cloud storage are diverse: they differ in size and popularity, and some have many copies. Existing deduplication schemes focus mainly on data security and do little to improve overall performance, in particular to keep computing overhead low for diverse data. Identifying data with deterministic tags based on convergent encryption may leak information about the data, while fully random tags generated by complex encryption algorithms incur higher computing overhead. To address these issues, we propose an encrypted deduplication scheme based on files diversity (FD-Dedup), in which diverse files are identified by semi-random tags. We also design a semi-random tag generation (SRTG) algorithm that trades off computing overhead against security. Security analysis and performance comparison show that FD-Dedup balances security and computing overhead for diverse files in personal cloud storage.
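The leakage risk of deterministic tags can be illustrated with a minimal sketch. It is not the SRTG algorithm and not taken from FD-Dedup; it only assumes the textbook convergent-encryption pattern in which the key is a hash of the file and the tag is a hash of the key. SHA-256, the function names, and the sample file contents are all assumptions made for illustration.

```python
import hashlib
from typing import List


def convergent_key(data: bytes) -> bytes:
    # Assumed construction: the key depends only on the content, K = H(M).
    return hashlib.sha256(data).digest()


def deterministic_tag(data: bytes) -> bytes:
    # Assumed construction: the tag depends only on the key, T = H(K).
    return hashlib.sha256(convergent_key(data)).digest()


def guess_matches(observed_tag: bytes, candidates: List[bytes]) -> List[bytes]:
    # Offline guessing: anyone who sees a tag can test candidate plaintexts,
    # because the tag is a deterministic function of the content alone.
    return [c for c in candidates if deterministic_tag(c) == observed_tag]


# Hypothetical sample data: two users upload the same predictable file.
file_a = b"salary: 5000"
file_b = b"salary: 5000"

tag = deterministic_tag(file_a)
same = tag == deterministic_tag(file_b)  # equal tags make deduplication possible

# The same property lets an observer confirm a guess about the content.
recovered = guess_matches(tag, [b"salary: 4000", b"salary: 5000", b"salary: 6000"])
```

Identical files yield identical tags, which is what makes deduplication possible, but the same determinism lets an observer confirm guesses about low-entropy files. Fully random tags remove this link at the cost of heavier cryptographic operations, which is the trade-off the semi-random tags are meant to address.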

Availability of data and materials
Not applicable.
Funding
Not applicable.
Author information
Contributions
Mr. He and Mr. Zhu wrote the main manuscript text. Mr. Zhu did experiments and prepared figures and tables. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
Not applicable.
Ethics approval
Not applicable.
About this article
Cite this article
He, X., Zhu, Y. An encrypted deduplication scheme based on files diversity. J Supercomput 80, 22860–22884 (2024). https://doi.org/10.1007/s11227-024-06325-0
DOI: https://doi.org/10.1007/s11227-024-06325-0