Protection of data privacy from vulnerability using two-fish technique with Apriori algorithm in data mining

The Journal of Supercomputing

A Correction to this article was published on 24 June 2022

This article has been updated

Abstract

Confidential data is mainly managed by creating passwords, tokens, and unique identifiers in an authorized manner. These records must be kept in a safe location, out of reach of unauthorized third parties. Both the client and server sides must be encrypted using the two-fish (Twofish) algorithm, which protects private data. A data miner who gains access to a user's information may be able to steal it. To avoid such situations, both the data miner side and the server side must be encrypted. Previous techniques also suffer from several shortcomings: higher computational overhead, poor resource utilization, proneness to single points of failure, lower accuracy, noise, poor security, and higher distortion. To overcome these problems, this study encrypts both the client and server sides using the two-fish algorithm to avoid information loss while data is transferred. The way state-of-the-art techniques handle privacy preservation often leads to privacy violations. This paper focuses on mining frequent itemsets in medical data while also ensuring privacy. Frequent itemset mining mainly aims to extract highly correlated items from the database; to achieve this, a novel fruitfly whale optimization algorithm (FWOA) is combined with the Apriori algorithm. The Apriori heuristic and bio-inspired algorithms are integrated to solve the frequent itemset problem, improving runtime performance on large datasets while still offering high-quality solutions. The adaptive k-anonymity approach preserves data privacy by transforming the original data into an encrypted form and offering privacy for top-k frequent itemset mining. Its main advantage is that the confidential information disclosed by an individual user cannot be distinguished from that of at least k − 1 other individuals.
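The k-anonymity property mentioned in the abstract can be illustrated with a minimal Python sketch. It checks the standard k-anonymity condition (every combination of quasi-identifier values is shared by at least k records) and is not the paper's adaptive k-anonymity approach. The record fields, the generalization rule, and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in >= k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize(record):
    """Hypothetical generalization: age to a decade band, zip code to a 3-digit prefix."""
    low = record["age"] // 10 * 10
    return {**record, "age": f"{low}-{low + 9}", "zip": record["zip"][:3] + "**"}


# Made-up records used only to exercise the functions.
records = [
    {"age": 34, "zip": "60601", "diagnosis": "flu"},
    {"age": 36, "zip": "60605", "diagnosis": "asthma"},
    {"age": 52, "zip": "60611", "diagnosis": "flu"},
    {"age": 57, "zip": "60614", "diagnosis": "diabetes"},
]
generalized = [generalize(r) for r in records]

raw_ok = is_k_anonymous(records, ["age", "zip"], k=2)
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

In the raw records every (age, zip) pair is unique, so the check fails for k = 2; after generalization each record shares its quasi-identifiers with one other record, so each individual is hidden among k − 1 = 1 others.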
Experiments conducted on a medical dataset indicate that the proposed methodology can offer data privacy in real time, and the results highlight the robustness of the scheme.
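The frequent itemset mining step named in the abstract can be sketched with a plain Apriori implementation in Python. This is only the classical level-wise Apriori procedure, assuming an absolute support threshold; it does not include the FWOA optimization or any encryption, and the symptom-style transactions below are made up for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must itself be frequent.
            if all(frozenset(s) in current for s in combinations(union, k - 1)):
                candidates.add(union)
        # Count the support of each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical transactions, not data from the paper.
transactions = [
    {"fever", "cough", "fatigue"},
    {"fever", "cough"},
    {"cough", "fatigue"},
    {"fever", "cough", "headache"},
    {"fever", "fatigue"},
]
frequent = apriori(transactions, min_support=3)
```

With a threshold of 3, the sketch keeps the three frequent single items and the pair {fever, cough}; the pruning step is what lets Apriori avoid counting candidates whose subsets are already known to be infrequent.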

Author information

Corresponding author

Correspondence to D. Dhinakaran.

Additional information

The original online version of this article was revised: The affiliation details for Author P. M. Joe Prathap were incorrectly given.

About this article

Cite this article

Dhinakaran, D., Prathap, P.M.J. Protection of data privacy from vulnerability using two-fish technique with Apriori algorithm in data mining. J Supercomput 78, 17559–17593 (2022). https://doi.org/10.1007/s11227-022-04517-0

  • DOI: https://doi.org/10.1007/s11227-022-04517-0