Optimization of Big Data Parallel Scheduling Based on Dynamic Clustering Scheduling Algorithm

Journal of Signal Processing Systems

Abstract

In today’s data age, big data processing and analysis frameworks play an important role in handling ever-growing volumes of information. The “Sharing Data” approach has been proposed to improve data processing performance through structured data scheduling. However, it incurs higher communication and buffering costs because of the extra data copies and buffering it requires. Hence, for the big data analysis environment, this paper proposes a Dynamic Cluster Scheduling Algorithm (DCSA), based on the correlation of data, for the parallel optimization of big data tasks. First, a dynamic data queue is generated from the server’s request database. The priority and size of each data item are the factors considered when building the queue for data clustering association. Weights are then introduced and the dynamic data items are equalized, which provides the basis for multi-channel optimal scheduling. Second, according to the relevance of the data items, an optimized data placement mechanism aggregates related data in the same frame. After placement is complete, the dynamic data is scheduled uniformly to minimize the migration cost, with the local characteristics of the data items as constraints. Through target iteration, the scheduling scheme is adjusted until multi-channel optimal scheduling is achieved. Experiments show that the proposed method enables dynamic data to achieve optimal scheduling.
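The abstract does not give the formulas behind the queue or the equalization step, so the following is only a minimal sketch of the first stage under stated assumptions. It orders data items by a weight that blends priority and size, then spreads them over several channels so that the total size per channel stays roughly even. The names `DataItem`, `weight`, `alpha`, `build_queue` and `balance_channels` are hypothetical, and both the linear weighting and the greedy least-loaded assignment are stand-ins chosen for illustration, not the paper's DCSA.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    name: str
    priority: float  # assumed convention: higher means more urgent
    size: float      # transfer cost in arbitrary units


def weight(item: DataItem, alpha: float = 0.5) -> float:
    # Hypothetical weighting: a linear blend of priority and size.
    return alpha * item.priority + (1 - alpha) * item.size


def build_queue(items, alpha: float = 0.5):
    # Dynamic data queue: items with the largest weight come first.
    return sorted(items, key=lambda it: weight(it, alpha), reverse=True)


def balance_channels(queue, n_channels: int):
    # Greedy equalization (an assumption): each item goes to the channel
    # whose accumulated size is currently the smallest.
    heap = [(0.0, c) for c in range(n_channels)]
    channels = [[] for _ in range(n_channels)]
    for item in queue:
        load, c = heapq.heappop(heap)
        channels[c].append(item)
        heapq.heappush(heap, (load + item.size, c))
    return channels
```

A call such as `balance_channels(build_queue(items), 2)` returns one list of items per channel. The placement and iterative migration stages described in the abstract are not modeled here.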

Acknowledgements

This paper is supported by the National Natural Science Foundation of China under Grant No. 61972293.

Author information

Correspondence to Yanxiang He.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on: Big Data Security Track

About this article

Cite this article

Liu, F., He, Y., He, J. et al. Optimization of Big Data Parallel Scheduling Based on Dynamic Clustering Scheduling Algorithm. J Sign Process Syst 94, 1243–1251 (2022). https://doi.org/10.1007/s11265-022-01765-4

  • DOI: https://doi.org/10.1007/s11265-022-01765-4