
Load Balancing in Cluster Using BLCR Checkpoint/Restart

  • Conference paper
Advances in Computing and Information Technology

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 176)


Abstract

Modern computation is growing more complex, and its resource requirements are steadily increasing. High Throughput Computing is one technique for dealing with this complexity. After running for a significant amount of time, computing clusters become heavily overloaded, which degrades performance. Because Computer Supported Cooperative Working (CSCW) has no central coordinator, load balancing is more complex. An overloaded node does not participate in a CSCW network because it has no spare capacity. This paper proposes migrating computation-intensive jobs from overloaded nodes to underloaded nodes, so that the previously overloaded nodes can participate in CSCW again. The proposed solution improves performance by increasing the number of nodes that participate in CSCW. Evaluation of the proposed approach shows that the availability and performance of CSCW clusters improve by 30%-40% with fault-tolerance-based load balancing.
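The migration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It is a simple greedy planner under stated assumptions: the `Node` class, the `plan_migrations` function, the load metric (total job load divided by capacity), and the `high`/`low` thresholds are all hypothetical. The sketch only decides which job should move where. The checkpoint on the source node and the restart on the target node would be carried out by BLCR and are not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; not taken from the paper."""
    name: str
    capacity: float                            # abstract work units the node can handle
    jobs: dict = field(default_factory=dict)   # job id -> estimated load of that job

    @property
    def load(self) -> float:
        # Fraction of capacity currently in use.
        return sum(self.jobs.values()) / self.capacity


def plan_migrations(nodes, high=0.8, low=0.5):
    """Greedily move the heaviest jobs off overloaded nodes.

    A node is treated as overloaded when load > high (assumed threshold)
    and as a migration target when load < low (assumed threshold).
    Returns a list of (job_id, source_name, target_name) tuples and
    updates the nodes' job tables to reflect the plan.
    """
    plan = []
    # Visit the most heavily loaded nodes first.
    for src in sorted(nodes, key=lambda n: n.load, reverse=True):
        while src.load > high:
            # Pick the most compute-intensive job on the overloaded node.
            job_id, job_load = max(src.jobs.items(), key=lambda kv: kv[1])
            # Targets must be underloaded and must not become overloaded themselves.
            candidates = [
                n for n in nodes
                if n is not src
                and n.load < low
                and (sum(n.jobs.values()) + job_load) / n.capacity <= high
            ]
            if not candidates:
                break  # nowhere to place the job; leave the node as it is
            dst = min(candidates, key=lambda n: n.load)
            del src.jobs[job_id]
            dst.jobs[job_id] = job_load
            plan.append((job_id, src.name, dst.name))
    return plan
```

For example, with three equal-capacity nodes where one is above the `high` threshold, the planner moves that node's heaviest job to the least loaded node that can accept it. A real deployment would also need to weigh checkpoint size and transfer cost, which this sketch ignores.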




Corresponding author

Correspondence to Hemant Hariyale.


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hariyale, H., Vardhan, M., Pandey, A., Mishra, A., Kushwaha, D.S. (2012). Load Balancing in Cluster Using BLCR Checkpoint/Restart. In: Meghanathan, N., Nagamalai, D., Chaki, N. (eds) Advances in Computing and Information Technology. Advances in Intelligent Systems and Computing, vol 176. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31513-8_74


  • DOI: https://doi.org/10.1007/978-3-642-31513-8_74

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31512-1

  • Online ISBN: 978-3-642-31513-8

  • eBook Packages: Engineering, Engineering (R0)
