Achieving low-entropy secure cloud data auditing with file and authenticator deduplication
Introduction
IDC estimates [19] that by the year 2020, the data held by each person would reach 5,200 GB. It is difficult for individual users to store such large-scale data locally, so more and more users turn to the cloud storage service for its massive storage capacity and computational power. However, according to a survey from EMC [10], 75% of cloud data are duplicated. Storing a single copy of each duplicated file is a better choice for reducing the storage overhead of the cloud [1]. As a result, the deduplication technique has drawn significant attention from researchers.
Cloud data auditing combined with the deduplication technique can both check whether the user's data is intact and reduce the storage cost of the cloud. For each data block, the user computes an authenticator with his/her own secret key; this authenticator is used for data integrity verification. Because different users hold different secret keys, they generate different authenticators for the same data block, so this approach cannot achieve authenticator deduplication. In reality, authenticators occupy considerable storage space in the cloud. If the security parameter is set to 80 bits, one data block occupies 20 bytes of storage, whereas one authenticator occupies up to 64 bytes [20]. Therefore, deduplicating the file and the authenticator at the same time is an important issue in cloud data auditing systems.
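To illustrate the scale of this overhead, a back-of-the-envelope calculation using the figures above (20-byte blocks, 64-byte authenticators at the 80-bit security level; the 1 GB file size is an illustrative assumption, not from the paper):

```python
# Per-block authenticator overhead at an 80-bit security level,
# using the 20-byte block / 64-byte authenticator figures from the text.
BLOCK_SIZE = 20          # bytes per data block
AUTH_SIZE = 64           # bytes per authenticator
FILE_SIZE = 10**9        # assumed 1 GB example file

num_blocks = FILE_SIZE // BLOCK_SIZE
auth_overhead = num_blocks * AUTH_SIZE

print(f"blocks: {num_blocks}")
print(f"authenticator storage is {auth_overhead / FILE_SIZE:.1f}x the file itself")
```

Since each block carries its own authenticator, authenticator storage exceeds the file's own size by a factor of 64/20 = 3.2, which is why deduplicating authenticators (not only file content) matters.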
Using a message-locked key (the file's hash value) as the secret key to compute authenticators can achieve authenticator deduplication [12]. However, it brings new problems. The linkage between the data and its owner is broken, and the large scale of users' cloud files imposes a heavy key-management burden. Moreover, some files have low entropy, e.g., electronic medical records and sensor data from IoT devices. Once the malicious cloud correctly deduces the content (or hash) of such a file, it can deduce the message-locked key. As a result, the malicious cloud could forge the authenticators, leaving the cloud data auditing unable to function.
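The message-locked key derivation described above can be sketched as follows (a minimal illustration; the function name and example content are assumptions, not this paper's notation):

```python
import hashlib

def message_locked_key(file_bytes: bytes) -> bytes:
    """Derive the message-locked (convergent) key as the file's hash:
    anyone holding the same file derives the same key."""
    return hashlib.sha256(file_bytes).digest()

# Two owners of the identical file derive the identical key,
# which is exactly what enables deduplication...
k1 = message_locked_key(b"patient record: blood type O+")
k2 = message_locked_key(b"patient record: blood type O+")
assert k1 == k2
# ...but it also means anyone who can guess the file's content
# can derive the key, with no secret of the owner involved.
```

This double-edged property is the root of the low-entropy problem: the key is a deterministic function of the (possibly guessable) data alone.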
To solve these issues, Liu et al. [20] recently proposed One-tag Checker, a cloud data auditing scheme with file and authenticator deduplication. They claim that their scheme is secure even under the condition that the cloud can somehow learn the content (or hash) of the file. Unfortunately, once the malicious cloud correctly deduces the hash of the file, it can in fact forge the corresponding authenticators in this scheme, which contradicts their security definition [20]. We give the attack strategy in the Appendix. Moreover, in their scheme, users have to remain online and interact with the Third Party Auditor (TPA) during each auditing process, which is infeasible in real-world applications.
Contribution:
In this paper, we propose a novel cloud data auditing scheme with file and authenticator deduplication. For a group of users who own the same file, only one copy of the data block and authenticator is stored in the cloud. To the best of our knowledge, the proposed scheme is the first practical one that truly achieves low-entropy security. For the low-entropy file, the cloud cannot forge any authenticator to pass the auditing verification. In addition, the proposed scheme is user-friendly. Users do not need to keep interacting with the TPA during each auditing task. As a result, users are relieved from the tedious auditing task.
In the proposed scheme, we give a new method to compute the authenticator and design a new form of file tag. The authenticator guarantees low-entropy security, since it is generated with a randomized message-locked key. The new form of file tag links the file to its owner; it also allows the TPA to perform the auditing task without constantly interacting with the user.
We give rigorous security analysis, showing that the proposed scheme satisfies soundness and low-entropy security. We also give detailed experiments to show the efficiency of our scheme.
Organization: The rest of this paper is organized as follows. Section 2 introduces the preliminaries. Section 3 presents our system model, definitions and design goals. Section 4 describes the scheme in detail. Section 5 shows the correctness and security proofs of the proposed scheme. Section 6 gives the comparison and experimental results. Section 7 reviews related work. Section 8 concludes the paper.
Section snippets
Preliminaries
We use G and G_T to denote two q-order multiplicative cyclic groups. We use g to denote a generator of G. We also pick two random elements in Z_q* and two random elements in G.
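For reference, the bilinear map underlying such group choices is a map e: G × G → G_T satisfying the standard properties (these are generic properties of bilinear pairings, not notation specific to this paper):

```latex
\text{Bilinearity: } e(u^a, v^b) = e(u, v)^{ab} \quad \forall\, u, v \in G,\ a, b \in \mathbb{Z}_q^*
\qquad
\text{Non-degeneracy: } e(g, g) \neq 1_{G_T}
```

Bilinearity is what lets a verifier check an aggregated proof over many blocks with a constant number of pairing evaluations.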
System model
As shown in Fig. 1, the system model contains four entities: the initial user, the subsequent user, the TPA and the cloud.
- •
The initial user: the first user who uploads a file F that did not previously exist in the cloud. The initial user generates authenticators for the encrypted data blocks, then uploads the file tag, authenticators and data blocks to the cloud, and finally goes offline.
- •
The subsequent user: It is the person who wishes to upload the
High level explanation
Convergent encryption can be used to achieve secure file deduplication, and using the message-locked key (the file's hash value) as the secret key to compute authenticators can achieve authenticator deduplication. However, this approach cannot guarantee low-entropy security: the malicious cloud can easily deduce the content (or the hash) of a low-entropy file, and with this convergent key it is easy for the malicious cloud to forge authenticators. Thus, how to achieve low-entropy security in cloud
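The brute-force threat on low-entropy files can be sketched as follows (a minimal illustration under assumed names; `target_key` stands in for any file-derived value, such as a ciphertext or authenticator component, that the cloud can test guesses against):

```python
import hashlib

def convergent_key(file_bytes: bytes) -> bytes:
    # Message-locked key: simply the hash of the file itself.
    return hashlib.sha256(file_bytes).digest()

# Low-entropy file: the attacker knows the template and that only a
# small field varies (e.g., a blood type in a medical record).
secret_file = b"blood type: AB-"
target_key = convergent_key(secret_file)  # value the cloud can test against

# The malicious cloud enumerates every plausible content...
candidates = [f"blood type: {t}".encode()
              for t in ("O+", "O-", "A+", "A-", "B+", "B-", "AB+", "AB-")]
recovered = next(c for c in candidates if convergent_key(c) == target_key)
assert recovered == secret_file
# ...recovers the file and its convergent key, and can then forge
# authenticators computed under that key.
```

With only eight candidates the search is instantaneous; no secret of the user ever enters the key derivation, which is the gap a randomized message-locked key is meant to close.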
Correctness and security analysis
Theorem 1 (Correctness). If the cloud correctly stores the data, the auditing proof can pass the verification. Proof. According to the properties of the bilinear map, Eq. (1) and Eq. (2) hold because:
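The scheme's Eqs. (1) and (2) are not reproduced in this snippet. For intuition, a typical BLS-style correctness argument, with per-block authenticator σ_i = (H(i) · u^{m_i})^x and public key v = g^x (illustrative notation, not this paper's), unfolds by bilinearity as:

```latex
e\!\Big(\prod_{i} \sigma_i^{\nu_i},\; g\Big)
  = e\!\Big(\prod_{i} \big(H(i)\, u^{m_i}\big)^{x \nu_i},\; g\Big)
  = e\!\Big(\prod_{i} H(i)^{\nu_i} \cdot u^{\sum_i \nu_i m_i},\; g^{x}\Big)
  = e\!\Big(\prod_{i} H(i)^{\nu_i} \cdot u^{\mu},\; v\Big),
```

where the ν_i are the challenge coefficients and μ = Σ_i ν_i m_i is the aggregated block value in the proof, so an honestly computed proof always satisfies the verification equation.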
Comparison
Table 2 compares the proposed scheme with some related schemes. Ref. [20] does not truly achieve low-entropy security: the malicious cloud can forge the authenticator (we give the attack strategy in the Appendix). Besides, in their scheme the user has to keep interacting with the TPA in the auditing phase, which imposes a heavy communication and computation burden on the user and is impractical in reality. Ref. [12] does not achieve low-entropy security. In their
Related work
Cloud data auditing. To perform integrity auditing of cloud data without retrieving all of it, Ateniese et al. [2] proposed "Provable Data Possession (PDP)". Juels and Kaliski Jr. [23] proposed "Proof of Retrievability (PoR)", which guarantees both the possession and the retrievability of the cloud data. In a PDP/PoR scheme, the user computes authenticators for data integrity verification. These authenticators are derived from homomorphic signatures (e.g. BLS short
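For reference, the BLS short signature that underlies many such authenticators is defined as follows (the standard textbook formulation, not any particular auditing scheme's variant):

```latex
\text{KeyGen: } x \xleftarrow{\$} \mathbb{Z}_q^*,\quad v = g^{x}
\qquad
\text{Sign: } \sigma = H(m)^{x}
\qquad
\text{Verify: } e(\sigma, g) \stackrel{?}{=} e(H(m), v)
```

Its homomorphic property, σ_1 · σ_2 = (H(m_1) H(m_2))^x, is what allows per-block authenticators to be aggregated into a single short auditing proof.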
Conclusion
In this paper, a low-entropy secure cloud data auditing scheme with file and authenticator deduplication has been proposed. We design a new way to compute authenticators and a new form of file tags. In the proposed scheme, the cloud only stores one copy of data blocks and authenticators for the duplicated file. For the low-entropy file, the malicious cloud cannot forge any authenticator to pass the integrity verification. Comprehensive experiments show that the storage performance of the cloud
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This research is supported by National Natural Science Foundation of China (61572267), National Cryptography Development Fund of China (MMJJ20170118), Key Research and Development Project of Shandong Province (2019GGX101051), the National Key R&D Program of China under Grant (No. 2017YFB0802300), the Open Project of the State Key Laboratory of Information Security (2019-MS-03).
References (35)
- et al., Enabling public auditing for shared data in cloud storage supporting identity privacy and traceability, J. Syst. Softw. (2016)
- et al., File-specific deduplication for cloud storages, Int. J. Appl. Res. Inform. Technol. Comput. (2017)
- et al., Provable data possession at untrusted stores, in:
- et al., Message-locked encryption and secure deduplication
- D. Boneh, X. Boyen, H. Shacham, Short group signatures, in: Annual International Cryptology Conference, Springer, 2004,...
- et al., Aggregate and verifiably encrypted signatures from bilinear maps
- et al., Short signatures from the Weil pairing
- et al., Secure auditing and deduplication for encrypted cloud data supporting ownership modification, Soft Comput. (2020)
- et al., Boosting efficiency and security in proof of ownership for deduplication, in:
- et al., Reclaiming space from duplicate files in a serverless distributed file system, in:
- Proofs of ownership in remote storage systems, in:
- Enabling secure auditing and deduplicating data without owner-relationship exposure in cloud storage, Cluster Comput.
- EVA: efficient versatile auditing scheme for IoT-based data market in JointCloud, IEEE Internet Things J.
- Secure data deduplication with dynamic ownership management in cloud storage, IEEE Trans. Knowl. Data Eng.