SMART: Speedup Job Completion Time by Scheduling Reduce Tasks

Dong, Jia-Qing; He, Ze-Hao; Gong, Yuan-Yuan; Yu, Pei-Wen; Tian, Chen; Dou, Wan-Chun; Chen, Gui-Hai; Xia, Nai; Guan, Hao-Ran

doi:10.1007/s11390-022-2118-5

SMART: Speedup Job Completion Time by Scheduling Reduce Tasks

Regular Paper
Published: 30 July 2022

Volume 37, pages 763–778, (2022)
Cite this article

Journal of Computer Science and Technology Aims and scope Submit manuscript

Jia-Qing Dong¹,
Ze-Hao He²,
Yuan-Yuan Gong²,
Pei-Wen Yu²,
Chen Tian²,
Wan-Chun Dou²,
Gui-Hai Chen²,
Nai Xia² &
…
Hao-Ran Guan³

182 Accesses
1 Citation
Explore all metrics

Abstract

Distributed computing systems have been widely used as the amount of data grows exponentially in the era of information explosion. Job completion time (JCT) is a major metric for assessing their effectiveness. How to reduce the JCT for these systems through reasonable scheduling has become a hot issue in both industry and academia. Data skew is a common phenomenon that can compromise the performance of such distributed computing systems. This paper proposes SMART, which can effectively reduce the JCT through handling the data skew during the reducing phase. SMART predicts the size of reduce tasks based on part of the completed map tasks and then enforces largest-first scheduling in the reducing phase according to the predicted reduce task size. SMART makes minimal modifications to the original Hadoop with only 20 additional lines of code and is readily deployable. The robustness and the effectiveness of SMART have been evaluated with a real-world cluster against a large number of datasets. Experiments show that SMART reduces JCT by up to 6.47%, 9.26%, and 13.66% for Terasort, WordCount and InvertedIndex respectively with the Purdue MapReduce benchmarks suite (PUMA) dataset.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Historical data based approach to mitigate stragglers from the Reduce phase of MapReduce in a heterogeneous Hadoop cluster

Article 01 February 2022

Predoop: Preempting Reduce Task for Job Execution Accelerations

Improvement of Makespan and TCTime in Dynamic Job Ordering and Slot Utilization for MapReduce Workloads

References

Grandl R, Kandula S, Rao S, Akella A, Kulkarni J. Graphene: Packing and dependency-aware scheduling for data-parallel clusters. In Proc. the 12th USENIX Conference on Operating Systems Design and Implementation, Nov. 2016, pp.81-97.
Chang H, Kodialam M, Kompella R R, Lakshman T V, Lee M, Mukherjee S. Scheduling in MapReduce-like systems for fast completion time. In Proc. the 30th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, Apr. 2011, pp.3074-3082. DOI: 10.1109/IN-FCOM.2011.5935152.
Peng Y, Chen K, Wang G, Bai W, Zhao Y, Wang H, Geng Y, Ma Z, Gu L. Towards comprehensive traffic forecasting in cloud computing: Design and application. IEEE/ACM Transactions on Networking, 2016, 24(4): 2210-2222. DOI: https://doi.org/10.1109/TNET.2015.2458892.
Article Google Scholar
Ullah I, Khan M S, Amir M, Kim J, Kim S M. LSTPD: Least slack time-based preemptive deadline constraint scheduler for Hadoop clusters. IEEE Access, 2020, 8: 111751-111762. DOI: https://doi.org/10.1109/ACCESS.2020.3002565.
Article Google Scholar
Gao Y, Zhou Y, Zhou B, Shi L, Zhang J. Handling data skew in MapReduce cluster by using partition tuning. Journal of Healthcare Engineering, 2017, 2017: Article No. 1425102. DOI: https://doi.org/10.1155/2017/1425102.
Hammoud M, Sakr M F. Locality-aware reduce task scheduling for MapReduce. In Proc. the 3rd IEEE International Conference on Cloud Computing Technology and Science, Nov. 29-Dec. 1, 2011, pp.570-576. DOI: 10.1109/CloudCom.2011.87.
Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. Commun. ACM, 2008, 51(1): 107-113. DOI: https://doi.org/10.1145/1327452.1327492.
Article Google Scholar
Ahmad F, Lee S, Thottethodi M, Vijaykumar T. PUMA: Purdue MapReduce benchmarks suite. Technical Report, Purdue University, 2012. https://engineering.purdue.edu/~puma/puma.pdf, May 2022.
Graham R L. Bounds for certain multiprocessing anomalies. The Bell System Technical Journal, 1966, 45(9): 1563-1581. DOI: https://doi.org/10.1002/j.1538-7305.1966.tb01709.x.
Article MATH Google Scholar
Mosharaf C, Ion S. Coow: A networking abstraction for cluster applications. In Proc. the 11th ACM Workshop on Hot Topics in Networks, Oct. 2012, pp.31-36. DOI: 10.1145/2390231.2390237.
Kwon Y, Balazinska M, Howe B, Rolia J. Skew-Tune: Mitigating skew in MapReduce applications. In Proc. the 2012 ACM SIGMOD International Conference on Management of Data, May 2012, pp.25-36. DOI: 10.1145/2213836.2213840.
Chowdhury M, Zhong Y, Stoica I. Efficient coow scheduling with Varys. ACM SIGCOMM Comput. Commun. Rev., 2014, 44(4): 443-454. DOI: https://doi.org/10.1145/2740070.2626315.
Article Google Scholar
Chowdhury M, Stoica I. Efficient coow scheduling without prior knowledge. In Proc. the 2015 ACM Conference on Special Interest Group on Data Communication, Aug. 2015, pp.393-406. DOI: 10.1145/2785956.2787480.
Mao H, Schwarzkopf M, Venkatakrishnan S B, Meng Z, Alizadeh M. Learning scheduling algorithms for data processing clusters. In Proc. the ACM Special Interest Group on Data Communication, Aug. 2019, pp.270-288. DOI: https://doi.org/10.1145/3341302.3342080.
Nguyen K, Wang K, Bu Y, Fang L, Hu J, Xu G. FACADE: A compiler and runtime for (almost) object-bounded big data applications. In Proc. the 20th International Conference on Architectural Support for Programming Languages and Operating Systems, Mar. 2015, pp.675-690. DOI: 10.1145/2694344.2694345.
Nguyen K, Fang L, Xu G, Demsky B, Lu S, Alamian S, Mutlu O. Yak: A high-performance big-data-friendly garbage collector. In Proc. the 12th USENIX Symposium on Operating Systems Design and Implementation, November 2016, pp.349-365.
Rasmussen A, Lam V T, Conley M, Porter G, Kapoor R, Vahdat A. Themis: An I/O-efficient MapReduce. In Proc. the 3rd ACM Symposium on Cloud Computing, Oct. 2012, Article No. 13. DOI: https://doi.org/10.1145/2391229.2391242.
Rao S, Ramakrishnan R, Silberstein A, Ovsiannikov M, Reeves D. Sailfish: A framework for large scale data processing. In Proc. the 3rd ACM Symposium on Cloud Computing, Oct. 2012, Article No. 4. DOI: https://doi.org/10.1145/2391229.2391233.
Zhang H, Cho B, Seyfe E, Ching A, Freedman M J. Riffle: Optimized shuffle service for large-scale data analytics. In Proc. the 13th EuroSys Conference, Apr. 2018, Article No. 43. DOI: https://doi.org/10.1145/3190508.3190534.
Zaharia M, Borthakur D, Sarma S J, Elmeleegy K, Shenker S, Stoica I. Delay scheduling: A simple technique for achieving locality and fairness in cluster scheduling. In Proc. the 5th European Conference on Computer Systems, Apr. 2010, pp.265-278. DOI: 10.1145/1755913.1755940.
Ibrahim S, Jin H, Lu L, He B, Antoniu G, Wu S. Maestro: Replica-aware map scheduling for MapReduce. In Proc. the 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2012, pp.435-442. DOI: 10.1109/CCGrid.2012.122.
Tang Z, Jiang L, Zhou J, Li K, Li K. A self-adaptive scheduling algorithm for reduce start time. Future Generation Computer Systems, 2015, 43/44: 51-60. DOI: https://doi.org/10.1016/j.future.2014.08.011.
Article Google Scholar
Ibrahim S, Jin H, Lu L, Wu S, He B, Qi L. LEEN: Locality/fairness-aware key partitioning for MapReduce in the cloud. In Proc. the 2nd IEEE International Conference on Cloud Computing Technology and Science, Nov. 30-Dec. 3, 2010, pp.17-24. DOI: 10.1109/CloudCom.2010.25.
Tan J, Meng X, Zhang L. Coupling task progress for MapReduce resource-aware scheduling. In Proc. the 2013 IEEE INFOCOM, Apr. 2013, pp.1618-1626. DOI: 10.1109/INFCOM.2013.6566958.

Download references

Author information

Authors and Affiliations

State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, 100024, China
Jia-Qing Dong
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China
Ze-Hao He, Yuan-Yuan Gong, Pei-Wen Yu, Chen Tian, Wan-Chun Dou, Gui-Hai Chen & Nai Xia
School of Computer Science, The University of Sydney, Sydney, NSW, 2006, Australia
Hao-Ran Guan

Authors

Jia-Qing Dong
View author publications
You can also search for this author in PubMed Google Scholar
Ze-Hao He
View author publications
You can also search for this author in PubMed Google Scholar
Yuan-Yuan Gong
View author publications
You can also search for this author in PubMed Google Scholar
Pei-Wen Yu
View author publications
You can also search for this author in PubMed Google Scholar
Chen Tian
View author publications
You can also search for this author in PubMed Google Scholar
Wan-Chun Dou
View author publications
You can also search for this author in PubMed Google Scholar
Gui-Hai Chen
View author publications
You can also search for this author in PubMed Google Scholar
Nai Xia
View author publications
You can also search for this author in PubMed Google Scholar
Hao-Ran Guan
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Wan-Chun Dou.

Supplementary Information

ESM 1

(PDF 155 kb)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Dong, JQ., He, ZH., Gong, YY. et al. SMART: Speedup Job Completion Time by Scheduling Reduce Tasks. J. Comput. Sci. Technol. 37, 763–778 (2022). https://doi.org/10.1007/s11390-022-2118-5

Download citation

Received: 27 December 2021
Accepted: 29 June 2022
Published: 30 July 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s11390-022-2118-5

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

SMART: Speedup Job Completion Time by Scheduling Reduce Tasks

Abstract

Access this article

Similar content being viewed by others

Historical data based approach to mitigate stragglers from the Reduce phase of MapReduce in a heterogeneous Hadoop cluster

Predoop: Preempting Reduce Task for Job Execution Accelerations

Improvement of Makespan and TCTime in Dynamic Job Ordering and Slot Utilization for MapReduce Workloads

References

Author information

Authors and Affiliations

Corresponding author

Supplementary Information

ESM 1

Rights and permissions

About this article

Cite this article

Keywords

Navigation

SMART: Speedup Job Completion Time by Scheduling Reduce Tasks

Abstract

Access this article

Similar content being viewed by others

Historical data based approach to mitigate stragglers from the Reduce phase of MapReduce in a heterogeneous Hadoop cluster

Predoop: Preempting Reduce Task for Job Execution Accelerations

Improvement of Makespan and TCTime in Dynamic Job Ordering and Slot Utilization for MapReduce Workloads

References

Author information

Authors and Affiliations

Corresponding author

Supplementary Information

ESM 1

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation