Abstract
Cloud storage systems have become one of the primary services used by Internet users. As the use of such systems grows rapidly, deduplication algorithms help address the resulting scalability issues. Although source-side deduplication saves both storage and bandwidth, data confidentiality remains the main concern with deduplication. Message-locked encryption (MLE) is a well-known key management framework that provides confidentiality for secure deduplication, and it is the basis of almost all proposed secure deduplication solutions. Although many works in the literature propose secure deduplication algorithms, to the best of our knowledge none of them provides an effective anonymity service for data owners. In this paper, we propose an N-anonymity algorithm that provides such a service and prevents even the cloud storage provider from learning which users store the same data. The algorithm is studied analytically, and the results are validated by extensive implementations using real data. Furthermore, we propose an ID-based key management algorithm as the cornerstone of the secure cloud storage system. The proposed algorithm, which can be regarded as an asymmetric extension of MLE, is easy to implement and compatible with existing cloud architectures as well as with the proposed anonymity-based deduplication system.
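As background for the MLE framework mentioned above, the following minimal Python sketch illustrates the general message-locked idea: the key is derived from the message itself, so identical plaintexts encrypt to identical ciphertexts and can be deduplicated. It is not the paper's N-anonymity or ID-based algorithm. The function names, the `store` dictionary, and the SHA-256 counter-mode keystream (a toy stand-in for a real cipher such as AES) are all assumptions made for illustration and are not suitable for production use.

```python
import hashlib
from typing import Dict, Tuple


def derive_key(message: bytes) -> bytes:
    # Message-locked key: a hash of the message itself (domain-separated).
    return hashlib.sha256(b"mle-key" + message).digest()


def keystream(key: bytes, length: int) -> bytes:
    # Toy deterministic keystream: SHA-256 over key || counter (illustrative only).
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(message: bytes) -> Tuple[bytes, bytes, str]:
    # Returns (key, ciphertext, tag); the tag is what the server compares.
    key = derive_key(message)
    stream = keystream(key, len(message))
    ciphertext = bytes(m ^ s for m, s in zip(message, stream))
    tag = hashlib.sha256(ciphertext).hexdigest()
    return key, ciphertext, tag


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    # XOR with the same keystream recovers the plaintext.
    stream = keystream(key, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


# Hypothetical server-side store: tag -> ciphertext.
store: Dict[str, bytes] = {}


def upload(message: bytes) -> Tuple[bytes, str, bool]:
    # Client encrypts locally; the server keeps one copy per distinct tag.
    key, ciphertext, tag = encrypt(message)
    is_new = tag not in store
    if is_new:
        store[tag] = ciphertext
    return key, tag, is_new


if __name__ == "__main__":
    key_a, tag_a, new_a = upload(b"same file contents")
    key_b, tag_b, new_b = upload(b"same file contents")
    print(new_a, new_b, tag_a == tag_b, len(store))
```

In this sketch the server can see that two uploads carry the same tag, and so can tell which users hold the same data; that observation is the anonymity gap the abstract describes.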





Data availability
Not applicable.
Funding
Not applicable.
Author information
Contributions
All authors have participated in conception and design, or analysis and interpretation of this paper.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Gharib, M., Fazli, M. Secure cloud storage with anonymous deduplication using ID-based key management. J Supercomput 79, 2356–2382 (2023). https://doi.org/10.1007/s11227-022-04751-6
DOI: https://doi.org/10.1007/s11227-022-04751-6