An improved secure file deduplication avoidance using CKHO based deep learning model in a cloud environment

The Journal of Supercomputing

Abstract

Data deduplication removes redundant copies of data and substantially reduces the required storage capacity, without compromising data fidelity or integrity. However, the major challenge faced by most data deduplication systems is secure cloud storage. Cloud computing relies on the ability and security of all information, and in distributed storage, data protection and security are critical. This paper presents a Secure Cloud Framework (SCF) that lets owners handle cloud-based information effectively and with high security. Weaknesses, Cross-Site Scripting (XSS), SQL injection, adverse processing, and wrapping are all examples of significant attacks in the cloud. This paper proposes an improved Secure File Deduplication Avoidance (SFDA) algorithm for block-level deduplication and security. The deduplication process allows cloud customers to manage distributed storage space adequately by avoiding redundant information and saving transfer bandwidth. A deep learning classifier is used to distinguish familiar from unfamiliar data. A dynamic perfect hashing scheme is used in the SFDA approach to perform convergent encryption and offer secure storage. The Chaotic Krill Herd Optimization (CKHO) algorithm is used for optimal secret key generation for the Advanced Encryption Standard (AES) algorithm. In this way, the unfamiliar data are encrypted one more time and stored in the cloud. Experiments demonstrate the efficiency of the approach in terms of computational cost, communication overhead, deduplication rate, and attack level. For file sizes of 8 MB, 16 MB, 32 MB, and 64 MB, the proposed methodology yields deduplication rates of 53%, 62%, 54%, and 44%, respectively. Dynamic perfect hashing and optimal key generation with the CKHO algorithm minimize the data update time: updating a total of 1024 MB of data takes 341.5 ms. The improved SFDA algorithm's optimal key selection approach reduces the impact of an attack by up to 12% for a data size of 50 MB, whereas the existing system is mostly impacted by data size, and its attack level rises by up to 19% for the same data size.
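
To make the block-level deduplication and convergent-encryption ideas in the abstract concrete, the following is a minimal sketch and not the authors' SFDA implementation. It assumes fixed-size blocks, SHA-256 for both key derivation and block tags, and an in-memory dictionary in place of cloud storage; the names `BLOCK_SIZE`, `convergent_key`, `fingerprint`, `deduplicate`, and `restore` are hypothetical. The AES encryption step, the CKHO key optimization, the deep learning classifier, and dynamic perfect hashing are all omitted.

```python
import hashlib

# Hypothetical block size; the abstract does not state the one used in the paper.
BLOCK_SIZE = 4096


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def convergent_key(block):
    """Derive a 256-bit key from the block content itself.

    Identical blocks always yield identical keys, which is the property
    that lets encrypted copies be deduplicated.
    """
    return hashlib.sha256(block).digest()


def fingerprint(block):
    """Tag used for duplicate lookup: a hash of the key, so the tag does not expose the key."""
    return hashlib.sha256(convergent_key(block)).hexdigest()


def deduplicate(data, store, block_size=BLOCK_SIZE):
    """Store only unseen blocks; return the file recipe and the deduplication rate."""
    recipe = []
    new_blocks = 0
    for block in split_blocks(data, block_size):
        tag = fingerprint(block)
        if tag not in store:
            # Assumption: a real system would store the block encrypted
            # with AES under convergent_key(block); plaintext is kept here.
            store[tag] = block
            new_blocks += 1
        recipe.append(tag)
    total = len(recipe)
    rate = (total - new_blocks) / total if total else 0.0
    return recipe, rate


def restore(recipe, store):
    """Rebuild the original bytes from the ordered list of block tags."""
    return b"".join(store[tag] for tag in recipe)


if __name__ == "__main__":
    store = {}
    data = b"A" * (3 * BLOCK_SIZE) + b"B" * BLOCK_SIZE
    recipe, rate = deduplicate(data, store)
    print(f"blocks: {len(recipe)}, unique stored: {len(store)}, dedup rate: {rate:.0%}")
    assert restore(recipe, store) == data
```

In this sketch the deduplication rate is simply the fraction of blocks that were already present in the store, which may differ from how the rate is measured in the paper's experiments.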

Data availability

Data and code will be shared whenever required for review.

Acknowledgements

I/We declare that this manuscript has not been submitted elsewhere and has not been published in any other journal. It does not contain anything that is outrageous, indecent, deceptive, plagiarized, defamatory, or otherwise contrary to law. I/We followed the journal's "Publication ethics and malpractice" statement provided on the journal's website and are responsible for the correctness and originality of the article and its quotations.

Author information

Contributions

Both authors contributed equally to this article.

Corresponding author

Correspondence to N. Mageshkumar.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mageshkumar, N., Lakshmanan, L. An improved secure file deduplication avoidance using CKHO based deep learning model in a cloud environment. J Supercomput 78, 14892–14918 (2022). https://doi.org/10.1007/s11227-022-04436-0

  • DOI: https://doi.org/10.1007/s11227-022-04436-0
