Task scheduling for MapReduce in heterogeneous networks

Cluster Computing

Abstract

This paper considers task scheduling in MapReduce for geo-distributed data centers on heterogeneous networks, taking adaptive heartbeats, job deadlines and data locality into account. Job deadlines are divided according to the maximum data volume of tasks. Under these constraints, task scheduling is formulated as an assignment problem in each heartbeat: adaptive heartbeats are calculated from the processing times of tasks, jobs are sequenced by their divided deadlines, and tasks are scheduled by the Hungarian algorithm. In the reduce phase, the most suitable data center for all mapped jobs is determined by taking both data transfer and processing times into account. Experimental results show that the proposed algorithms outperform existing ones, and that the proposals with sorted task sequences perform better than those with random task sequences.
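To make the per-heartbeat assignment step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical cost model in which the cost of running a task on a slot is its data transfer time (zero when the data is local) plus its processing time, and it solves the resulting assignment problem with a textbook Hungarian algorithm. The names `build_cost` and `hungarian`, the parameters, and the numbers are illustrative assumptions; the adaptive heartbeat calculation and the deadline division described in the abstract are not modelled here.

```python
def build_cost(volumes, task_site, slot_site, slot_speed, bandwidth):
    """Hypothetical cost model: transfer time plus processing time.

    volumes[i]      data volume of task i
    task_site[i]    data center holding the input of task i
    slot_site[j]    data center hosting slot j
    slot_speed[j]   processing rate of slot j (volume per time unit)
    bandwidth[a][b] link rate between data centers a and b
    """
    cost = []
    for vol, src in zip(volumes, task_site):
        row = []
        for dst, speed in zip(slot_site, slot_speed):
            # Local data needs no transfer (data locality).
            transfer = 0.0 if src == dst else vol / bandwidth[src][dst]
            row.append(transfer + vol / speed)
        cost.append(row)
    return cost


def hungarian(cost):
    """Minimum-cost assignment of rows (tasks) to columns (slots).

    Textbook O(n^2 * m) Hungarian algorithm with potentials.
    Requires len(cost) <= len(cost[0]).
    Returns (assignment, total), where assignment[i] is the column for row i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (m + 1)      # column potentials
    match = [0] * (m + 1)    # match[j] = row assigned to column j (1-based)
    way = [0] * (m + 1)      # predecessor columns on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta, j1 = inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if match[j] != 0:
            assignment[match[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total


# Illustrative numbers only: two tasks, two slots, two data centers.
cost = build_cost(
    volumes=[8.0, 4.0],
    task_site=[0, 1],
    slot_site=[0, 1],
    slot_speed=[4.0, 2.0],
    bandwidth=[[0.0, 2.0], [2.0, 0.0]],
)
assignment, total = hungarian(cost)
print(assignment, total)
```

In this toy instance each task is cheapest on the slot in its own data center, so the assignment keeps both tasks local. When more slots than tasks are available in a heartbeat, the extra columns simply stay unassigned.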

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61572127, 61272377) and the Specialized Research Fund for the Doctoral Program of Higher Education (20120092110027).

Author information

Corresponding author

Correspondence to Xiaoping Li.

About this article

Cite this article

Wang, J., Li, X. Task scheduling for MapReduce in heterogeneous networks. Cluster Comput 19, 197–210 (2016). https://doi.org/10.1007/s10586-015-0503-3

  • DOI: https://doi.org/10.1007/s10586-015-0503-3
