A New Adaptive Hybrid Mutation Black Widow Clustering Based Data Partitioning for Big Data Analysis

Wireless Personal Communications

Abstract

In recent years, big data has played a significant role in the development of high-demand data storage. Big data comprises a large number of datasets, which are difficult to handle with traditional database management systems. It has become increasingly popular because it can manage different data sources and formats using several advanced technologies. However, several existing research works remain ineffective in dealing with today’s issues. To overcome these shortcomings, this paper proposes a novel adaptive hybrid mutation black widow optimization (AHMBWO) based clustering approach for a distributed data management system in HDFS. The proposed approach comprises three phases: the construction of resource description framework (RDF) graphs; AHMBWO-based clustering for the distributed data management system in HDFS; and placement and partitioning to handle and manage the distribution of data. In addition, seven test functions are employed to evaluate the performance of the proposed AHMBWO algorithm. The clustering results of the proposed AHMBWO are then compared with those of several other approaches, namely BWO, PSO, GA and BBO, to test their validity on the respective datasets. The experimental analysis reveals that the proposed AHMBWO approach performs better, with less execution time, than all the other approaches.
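The abstract does not give the AHMBWO update rules or name the seven test functions. As a rough illustration only, the following Python sketch shows a generic baseline black widow optimization (BWO) loop with procreation, cannibalism and mutation steps, minimizing the sphere function. The sphere benchmark, the function names `sphere` and `bwo_minimize`, and all parameter values are assumptions made for this sketch; it is not the authors' adaptive hybrid mutation variant.

```python
import random


def sphere(x):
    """Sphere benchmark (assumed here): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def bwo_minimize(fitness, dim=5, pop_size=20, iters=100,
                 procreate_rate=0.6, cannibal_rate=0.4, mutate_rate=0.4,
                 bounds=(-5.0, 5.0), seed=0):
    """Generic BWO-style minimizer (illustrative, requires dim >= 2)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(iters):
        pop.sort(key=fitness)
        n_parents = max(2, int(procreate_rate * pop_size))
        parents = pop[:n_parents]
        children = []
        for _ in range(n_parents // 2):
            p1, p2 = rng.sample(parents, 2)
            brood = []
            # Procreation: each random alpha vector yields two mirrored offspring.
            for _ in range(dim // 2 or 1):
                alpha = [rng.random() for _ in range(dim)]
                brood.append([a * u + (1 - a) * v
                              for a, u, v in zip(alpha, p1, p2)])
                brood.append([a * v + (1 - a) * u
                              for a, u, v in zip(alpha, p1, p2)])
            # Sexual cannibalism: only the fitter parent survives.
            brood.append(min(p1, p2, key=fitness))
            # Sibling cannibalism: drop the weakest fraction of the brood.
            brood.sort(key=fitness)
            keep = max(1, int((1 - cannibal_rate) * len(brood)))
            children.extend(brood[:keep])
        # Mutation: swap two coordinates of randomly chosen individuals.
        mutants = []
        for _ in range(int(mutate_rate * pop_size)):
            m = list(rng.choice(pop))
            i, j = rng.sample(range(dim), 2)
            m[i], m[j] = m[j], m[i]
            mutants.append(m)
        # Keep the current best (elitism) and truncate to the population size.
        pop = sorted(children + mutants + pop[:1], key=fitness)[:pop_size]
    best = min(pop, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    best, value = bwo_minimize(sphere)
    print("best fitness found:", value)
```

Because the previous best individual is always carried into the next generation, the best fitness in this sketch never gets worse from one iteration to the next. In a clustering setting, the fitness function would presumably score a candidate set of cluster centres rather than a benchmark function, but the abstract does not specify the encoding.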

Author information

Corresponding author

Correspondence to S. Ravikumar.

About this article

Cite this article

Ravikumar, S., Kavitha, D. A New Adaptive Hybrid Mutation Black Widow Clustering Based Data Partitioning for Big Data Analysis. Wireless Pers Commun 120, 1313–1339 (2021). https://doi.org/10.1007/s11277-021-08516-x

  • DOI: https://doi.org/10.1007/s11277-021-08516-x
