Abstract
Data deduplication eliminates redundant copies of data and can substantially reduce the required storage capacity, without compromising data fidelity or integrity. However, the major challenge faced by most data deduplication systems is secure cloud storage. Cloud computing relies on the availability and security of all information, and in distributed storage, data protection and security are critical. This paper presents a Secure Cloud Framework (SCF) that lets owners handle cloud-based information effectively and with high security. Significant attacks in the cloud include exploitation of weaknesses, Cross-Site Scripting (XSS), SQL injection, adverse processing, and wrapping. This paper proposes an improved Secure File Deduplication Avoidance (SFDA) algorithm for block-level deduplication and security. Deduplication allows cloud customers to manage distributed storage space adequately by avoiding redundant information and saving bandwidth. A deep learning classifier is used to distinguish familiar from unfamiliar data. The SFDA approach uses a dynamic perfect hashing scheme to perform convergent encryption and offer secure storage. The Chaotic Krill Herd Optimization (CKHO) algorithm is used to generate the optimal secret key for the Advanced Encryption Standard (AES) algorithm. In this way, the unfamiliar data are encrypted one more time and stored in the cloud. Experiments demonstrate the efficiency of the approach in terms of computational cost, communication overhead, deduplication rate, and attack level. For file sizes of 8 MB, 16 MB, 32 MB, and 64 MB, the proposed methodology yields deduplication rates of 53%, 62%, 54%, and 44%, respectively.
Dynamic perfect hashing and optimal key generation using the CKHO algorithm minimize the data update time: updating a total of 1024 MB of data takes 341.5 ms. The improved SFDA algorithm's optimal key selection approach reduces the impact of an attack by up to 12% for a data size of 50 MB, whereas the existing system is strongly affected by data size, and its attack level rises by up to 19% for the same data size.
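The general idea of block-level deduplication with content-derived (convergent) keys, as named in the abstract, can be illustrated with a minimal Python sketch. This is not the SFDA algorithm itself: the fixed 4096-byte block size, the `BlockStore` class, and all function names are hypothetical, SHA-256 stands in for the dynamic perfect hashing scheme, the AES encryption step and the CKHO key optimization are omitted, and the deduplication rate is assumed here to mean the fraction of uploaded blocks that did not need to be stored.

```python
import hashlib


def split_blocks(data: bytes, block_size: int = 4096) -> list[bytes]:
    """Split data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def convergent_key(block: bytes) -> bytes:
    """Derive a 256-bit key from the block content itself.

    Identical blocks yield identical keys, which is what lets encrypted
    copies of the same data be recognized as duplicates.
    """
    return hashlib.sha256(block).digest()


def fingerprint(block: bytes) -> str:
    """Duplicate-detection tag: a hash of the key, so the tag does not reveal the key."""
    return hashlib.sha256(convergent_key(block)).hexdigest()


class BlockStore:
    """Hypothetical in-memory store that keeps one copy of each distinct block."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}
        self.total_blocks = 0

    def put(self, data: bytes, block_size: int = 4096) -> list[str]:
        """Store data block by block and return the list of tags (the file recipe)."""
        recipe = []
        for block in split_blocks(data, block_size):
            tag = fingerprint(block)
            self.total_blocks += 1
            if tag not in self.blocks:
                # Assumption: a real system would store the AES ciphertext of
                # the block under convergent_key(block); plaintext is kept here
                # only to keep the sketch dependency-free.
                self.blocks[tag] = block
            recipe.append(tag)
        return recipe

    def get(self, recipe: list[str]) -> bytes:
        """Rebuild the original data from its recipe."""
        return b"".join(self.blocks[tag] for tag in recipe)

    def dedup_rate(self) -> float:
        """Fraction of uploaded blocks that were duplicates (assumed definition)."""
        if self.total_blocks == 0:
            return 0.0
        return 1 - len(self.blocks) / self.total_blocks


store = BlockStore()
data = b"A" * 4096 + b"B" * 4096 + b"A" * 4096 + b"C" * 4096
recipe = store.put(data)
print(len(store.blocks), store.dedup_rate())  # 3 distinct blocks out of 4, rate 0.25
```

Because the key depends only on the block content, two owners uploading the same block produce the same tag, so the store can skip the second copy without comparing plaintext across users.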
Data availability
Data and code will be shared whenever they are required for review.
Change history
15 July 2024
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1007/s11227-024-06360-x
Acknowledgements
We declare that this article has not been submitted elsewhere and has not been published in other journals. It contains nothing that is outrageous, indecent, deceptive, plagiarized, defamatory, or otherwise contrary to the rules. We followed the journal's "Publication ethics and malpractice" statement, provided on the journal's website, and we are responsible for the correctness and genuineness of the article.
Author information
Authors and Affiliations
Contributions
Both authors contributed their skills and effort equally to produce this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mageshkumar, N., Lakshmanan, L. RETRACTED ARTICLE: An improved secure file deduplication avoidance using CKHO based deep learning model in a cloud environment. J Supercomput 78, 14892–14918 (2022). https://doi.org/10.1007/s11227-022-04436-0
DOI: https://doi.org/10.1007/s11227-022-04436-0