Abstract
In this work we investigate the online over-list MapReduce processing problem on two identical parallel machines, with the objective of minimizing the makespan. Jobs are revealed one by one, and each job consists of one map task and one reduce task. The map task can be arbitrarily split and processed on both machines simultaneously, while the reduce task has to be processed on a single machine and cannot start until the map task of the same job has been completed. We first show that the general case of the problem reduces to the classical two-machine online scheduling model, for which the optimal competitive ratio is 3/2. For the special case where the map task is at least as long as the reduce task, we prove that no online algorithm can achieve a competitive ratio smaller than 4/3. We also propose a Greedy algorithm whose competitive ratio matches this lower bound and which is therefore optimal.
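The scheduling model described above can be illustrated with a minimal Python sketch. The splitting rule used here is an assumption made purely for illustration and is not claimed to be the Greedy algorithm analyzed in the paper: each arriving job's map task is used to level the two machine loads as far as possible, and the reduce task is then placed on one machine right after the map task finishes. The function names `greedy_makespan` and `makespan_lower_bound` are hypothetical.

```python
def greedy_makespan(jobs):
    """Schedule jobs one by one (over-list) on two identical machines.

    jobs: iterable of (map_length, reduce_length) pairs.
    Illustrative rule (an assumption, not necessarily the paper's Greedy):
    pour the map task onto the less loaded machine first, split any
    remainder evenly, then run the reduce task on a single machine.
    """
    low, high = 0.0, 0.0  # current completion times, with low <= high
    for m, r in jobs:
        gap = high - low
        if m <= gap:
            # The whole map task fits on the less loaded machine;
            # the reduce task follows it there without idle time.
            low = low + m + r
        else:
            # Map work levels both machines to a common finish time t;
            # the reduce task then runs on one of them.
            t = (low + high + m) / 2
            low, high = t, t + r
        low, high = min(low, high), max(low, high)
    return high


def makespan_lower_bound(jobs):
    """A simple lower bound on the offline optimal makespan.

    Either the total work is shared perfectly by two machines, or some
    job needs at least map/2 (fully split) plus its whole reduce task.
    """
    jobs = list(jobs)
    total = sum(m + r for m, r in jobs)
    longest = max((m / 2 + r for m, r in jobs), default=0.0)
    return max(total / 2, longest)


if __name__ == "__main__":
    instance = [(4, 2), (2, 1)]  # map >= reduce for every job
    print(greedy_makespan(instance), makespan_lower_bound(instance))
```

Dividing the makespan of the sketch by the lower bound gives an upper estimate of how far this rule is from the offline optimum on a particular instance. It says nothing by itself about the worst-case competitive ratio.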

Cite this article
Huang, J., Zheng, F., Xu, Y. et al. Online MapReduce processing on two identical parallel machines. J Comb Optim 35, 216–223 (2018). https://doi.org/10.1007/s10878-017-0167-4
DOI: https://doi.org/10.1007/s10878-017-0167-4