Distributed Scheduling Extension on Hadoop

Dadan, Zeng; Xieqin, Wang; Ningkang, Jiang

doi:10.1007/978-3-642-10665-1_73

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 5931)

Abstract

Distributed computing splits a large-scale job into multiple tasks and processes them on a cluster. Cluster resource allocation is the key factor limiting the efficiency of a distributed computing platform. Hadoop is currently the most popular open-source distributed platform. However, its existing scheduling strategies are fairly simple and cannot meet needs such as sharing a cluster among multiple users, guaranteeing capacity for each job, and providing good performance for interactive jobs. This paper examines the existing scheduling strategies, analyses their shortcomings, and adds three new features to Hadoop: temporarily raising the weight of a job, letting higher-priority jobs preempt cluster resources, and sharing computing resources among multiple users. Experiments show that these features help provide better performance for interactive jobs and a fairer share of computing time among users.
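The three features named in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation and does not use any Hadoop API; the names `Job`, `fair_shares`, `pick_next`, and `preempt_for`, the equal per-user split, and the "larger number is more urgent" priority convention are all assumptions made for illustration. It assumes every job has a positive weight.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    name: str
    user: str
    priority: int = 0     # assumed convention: larger number = more urgent
    weight: float = 1.0   # base weight within the owner's share (must be > 0)
    boost: float = 1.0    # temporary multiplier; set back to 1.0 to end the boost
    demand: int = 0       # slots the job could use right now
    running: int = 0      # slots the job currently holds

    def effective_weight(self) -> float:
        return self.weight * self.boost


def fair_shares(jobs: List[Job], total_slots: int) -> Dict[str, float]:
    """Give each user an equal slice, then split it by effective job weight."""
    by_user: Dict[str, List[Job]] = defaultdict(list)
    for job in jobs:
        by_user[job.user].append(job)
    shares: Dict[str, float] = {}
    if not by_user:
        return shares
    per_user = total_slots / len(by_user)
    for user_jobs in by_user.values():
        total_weight = sum(j.effective_weight() for j in user_jobs)
        for j in user_jobs:
            shares[j.name] = per_user * j.effective_weight() / total_weight
    return shares


def pick_next(jobs: List[Job], shares: Dict[str, float]) -> Optional[Job]:
    """When a slot frees up, choose the waiting job furthest below its share."""
    waiting = [j for j in jobs
               if j.running < j.demand and shares.get(j.name, 0) > 0]
    if not waiting:
        return None
    # Ties are broken by name so the choice is deterministic.
    return min(waiting, key=lambda j: (j.running / shares[j.name], j.name))


def preempt_for(job: Job, jobs: List[Job]) -> Optional[Job]:
    """Move one slot from the lowest-priority running job to `job`, if one is lower."""
    victims = [j for j in jobs if j.running > 0 and j.priority < job.priority]
    if not victims:
        return None
    victim = min(victims, key=lambda j: (j.priority, j.name))
    victim.running -= 1
    job.running += 1
    return victim
```

In this model, setting `boost` above 1.0 on an interactive job enlarges its slice of its owner's share without reducing other users' shares, and `preempt_for` reclaims slots one at a time rather than killing a whole job. Both are design choices of the sketch, not claims about the paper's scheduler.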

Author information

Authors and Affiliations

Software Engineering Institute, East China Normal University,
Zeng Dadan, Wang Xieqin & Jiang Ningkang

Editor information

Editors and Affiliations

SINTEF ICT, NO-7465, Trondheim, Norway
Martin Gilje Jaatun
School of Computer Science, South China Normal University, Guangzhou, China
Gansen Zhao
Department of Electrical Engineering and Computer Science, University of Stavanger, NO-4036, Stavanger, Norway
Chunming Rong

About this paper

Cite this paper

Dadan, Z., Xieqin, W., Ningkang, J. (2009). Distributed Scheduling Extension on Hadoop. In: Jaatun, M.G., Zhao, G., Rong, C. (eds) Cloud Computing. CloudCom 2009. Lecture Notes in Computer Science, vol 5931. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10665-1_73

DOI: https://doi.org/10.1007/978-3-642-10665-1_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10664-4
Online ISBN: 978-3-642-10665-1
eBook Packages: Computer Science, Computer Science (R0)
