Jargon of Hadoop MapReduce scheduling techniques: a scientific categorization

Muhammad Hanif; Choonhwa Lee

doi:10.1017/S0269888918000371

Published online by Cambridge University Press: 15 March 2019

Affiliation (both authors): Division of Computer Science and Engineering, Hanyang University, Seoul, Republic of Korea; e-mail: honeykhan@hanyang.ac.kr, lee@hanyang.ac.kr

Abstract

The valuable knowledge that can be extracted from huge volumes of data (so-called Big Data) has set in motion the development of parallel and distributed data-processing frameworks, including Apache Hadoop, Facebook Corona, and Microsoft Dryad. Apache Hadoop is an open-source implementation of Google MapReduce that has attracted strong attention from researchers in both academia and industry. Hadoop MapReduce scheduling algorithms play a critical role in managing large commodity clusters: they control QoS requirements by supervising users, jobs, and task execution. Hadoop MapReduce comprises three schedulers: FIFO, Fair, and Capacity. The research community has also developed new optimizations that account for advances and dynamic changes in hardware and operating environments. Numerous efforts in the literature address network congestion, straggling, data locality, heterogeneity, resource under-utilization, and skew mitigation in Hadoop scheduling. The volume of research on Hadoop scheduling published in journals and conferences has increased consistently, which makes it difficult for researchers to grasp the overall state of the field and the areas that require further investigation. This study presents a scientific literature review that assesses prior research contributions to the Apache Hadoop scheduling mechanism. We classify and quantify the main issues addressed in the literature based on their jargon and the areas they address. Moreover, we explain and discuss the various challenges and open issues in Hadoop scheduling optimizations.

Type: Review
Information: The Knowledge Engineering Review, Volume 34, 2019, e4

DOI: https://doi.org/10.1017/S0269888918000371
Copyright: © Cambridge University Press, 2019
