Statistical modelling and parametric optimization in document fragmentation

  • Original Article
  • Neural Computing and Applications

Abstract

Business enterprises and individuals are increasingly drawn to cloud computing because of its cost efficiency and scalability. Although cloud adoption is significant, security has become a serious concern because of multi-tenancy. Cloud service providers generally commit to ensuring data reliability and security, but these commitments may be strained by the rapid growth in the number of cloud customers. To address these security issues and protect documents uploaded to the cloud, cryptography is widely used. Data security can be improved further through fragmentation, in which partitions of the data are outsourced instead of the entire document. Fragmentation becomes difficult and time-consuming as document size grows. This paper proposes an efficient fragmentation process based on virtualization. Generating multiple VMs makes efficient use of CPU cycles, which reduces the time complexity of the fragmentation process. Document size, processor capacity, storage capacity and the number of VMs are considered as factors, and their influence on fragmentation time is analysed. Healthcare documents are fragmented, and the measured real-time values are analysed statistically. The experiments use a private OpenStack cloud running on Oracle VirtualBox. The Taguchi technique (L27 orthogonal array) is employed to find the optimum levels of the parameters for fragmentation time, while analysis of variance is used to quantify the contribution of each parameter to the performance of the fragmentation process. The results show that document size is the most dominant factor influencing fragmentation time, followed by processor speed. Parallelizing the fragmentation process across multiple VMs reduces its time complexity.
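As a rough illustration of how a Taguchi analysis of fragmentation time could be set up, the sketch below computes a signal-to-noise (S/N) ratio per experimental run and averages it per factor level. It assumes the smaller-the-better criterion, S/N = -10 log10(mean(y^2)), which is the usual choice when the response (here, time) is to be minimized; the abstract does not state which criterion the authors used. The function names, the factor name "vms" and the timings in the demo are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def sn_smaller_is_better(values):
    """Taguchi smaller-the-better S/N ratio in dB: -10 * log10(mean(y^2))."""
    if not values:
        raise ValueError("need at least one measurement")
    mean_square = sum(y * y for y in values) / len(values)
    return -10.0 * math.log10(mean_square)


def main_effects(runs, factor):
    """Average S/N ratio for each level of one factor.

    runs is a list of (levels, times) pairs, where levels maps a factor
    name to its level in that run and times holds the repeated
    fragmentation-time measurements for that run.
    """
    by_level = defaultdict(list)
    for levels, times in runs:
        by_level[levels[factor]].append(sn_smaller_is_better(times))
    return {level: sum(v) / len(v) for level, v in by_level.items()}


def best_level(runs, factor):
    """Level with the highest mean S/N ratio (the preferred setting)."""
    effects = main_effects(runs, factor)
    return max(effects, key=effects.get)


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    demo_runs = [
        ({"doc_size": 1, "vms": 1}, [10.0, 10.0]),
        ({"doc_size": 1, "vms": 2}, [5.0, 5.0]),
        ({"doc_size": 2, "vms": 1}, [20.0, 20.0]),
        ({"doc_size": 2, "vms": 2}, [10.0, 10.0]),
    ]
    for factor in ("doc_size", "vms"):
        print(factor, main_effects(demo_runs, factor), best_level(demo_runs, factor))
```

In a full study, the runs would follow the rows of the orthogonal array, and the spread of the per-level means for each factor gives a first indication of which factor dominates before a formal analysis of variance is carried out.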

Author information

Corresponding author

Correspondence to R. Kalaiselvi.

About this article

Cite this article

Kalaiselvi, R., Kousalya, K. Statistical modelling and parametric optimization in document fragmentation. Neural Comput & Applic 32, 5909–5918 (2020). https://doi.org/10.1007/s00521-019-04068-1
