Delay Scheduling with Reduced Workload on JobTracker in Hadoop

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 424)

Abstract

Job scheduling is a critical issue in MapReduce processing and directly affects the performance of the Hadoop framework. Delay scheduling introduces a small delay during job scheduling to improve data locality. However, the delay scheduler may scan a job more than once before the job reaches a deadline, after which it is scheduled regardless of locality. This repeated scanning adds overhead to the scheduler, and a higher-priority job may be delayed. We propose an algorithm that distributes this load among the individual nodes. The algorithm requires the scheduler to launch a high-priority job on a free node. That node then either executes the job locally or forwards it to another node, depending on where the data is available. Experimental results show that the proposed algorithm performs better than Hadoop and achieves lower execution time.
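The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract does not give. The names `Job`, `Node`, `delay_schedule`, `node_side_schedule`, and `max_skips` are hypothetical, a skip counter stands in for the deadline, and running the job on the free node when no other node holds its data is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    priority: int        # assumed convention: larger value = higher priority
    input_block: str     # identifier of the data block the job reads
    skips: int = 0       # times the central scheduler has passed over the job


@dataclass
class Node:
    name: str
    blocks: set = field(default_factory=set)  # data blocks stored on this node


def delay_schedule(jobs, free_node, max_skips):
    """Delay scheduling sketch: the central scheduler rescans jobs and
    skips a job without local data until it has been skipped max_skips times."""
    for job in sorted(jobs, key=lambda j: -j.priority):
        if job.input_block in free_node.blocks:
            return job, "local"
        if job.skips >= max_skips:
            return job, "non-local"   # deadline reached, give up on locality
        job.skips += 1                # extra scan: overhead on the scheduler
    return None, "idle"


def node_side_schedule(jobs, free_node, nodes):
    """Sketch of the proposed idea: the scheduler hands the highest-priority
    job to the free node, and that node decides where the job runs."""
    if not jobs:
        return None, None
    job = max(jobs, key=lambda j: j.priority)
    if job.input_block in free_node.blocks:
        return job, free_node         # data is local: execute here
    for other in nodes:
        if job.input_block in other.blocks:
            return job, other         # forward to a node that holds the data
    return job, free_node             # assumption: no holder known, run here


n1 = Node("n1", {"a"})
n2 = Node("n2", {"b"})
high = Job("high", priority=2, input_block="b")
low = Job("low", priority=1, input_block="a")

print(delay_schedule([high, low], n1, max_skips=1))
print(node_side_schedule([high, low], n1, [n1, n2]))
```

In this toy setup, delay scheduling skips the high-priority job on `n1` and launches the low-priority one. The node-side variant places the high-priority job at once, on the node that holds its data, without the central scheduler rescanning it.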

Acknowledgements

The research work is supported by Department of Computer Science & Engineering, Indian School of Mines, Dhanbad, India.

Author information

Correspondence to Dharavath Ramesh.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Sethi, K.K., Ramesh, D. (2016). Delay Scheduling with Reduced Workload on JobTracker in Hadoop. In: Snášel, V., Abraham, A., Krömer, P., Pant, M., Muda, A. (eds) Innovations in Bio-Inspired Computing and Applications. Advances in Intelligent Systems and Computing, vol 424. Springer, Cham. https://doi.org/10.1007/978-3-319-28031-8_32

  • DOI: https://doi.org/10.1007/978-3-319-28031-8_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28030-1

  • Online ISBN: 978-3-319-28031-8

  • eBook Packages: Engineering, Engineering (R0)
