
Heuristic-Based Scheduling to Maximize Throughput of Data-Intensive Grid Applications

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3326)

Abstract

Job scheduling in data grids must consider not only the computation load at each grid node but also the distribution of the data required by each job. Furthermore, recent trends in grid applications emphasize high throughput over high performance. In this paper, we propose a centralized scheduling scheme that uses a heuristic called Maximum Residual Resource (MRR), which targets high throughput for data grid applications. We have analyzed the performance potential of MRR and developed a simulator to evaluate it under typical grid configurations. Our results show that MRR brings significant performance improvements over existing online and batch heuristics such as MCT, Min-min, and Max-min.
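
For readers unfamiliar with the baseline heuristics named in the abstract, the following is a minimal sketch of MCT (online) and Min-min / Max-min (batch) as they are commonly described in the heterogeneous-computing literature. It is not taken from the paper and does not implement MRR, which the abstract does not define. The function names, the `etc` matrix (estimated time for each job on each machine), and the sample values are assumptions made for illustration; in a data grid setting, each entry would presumably also include the time to stage the job's input data.

```python
from typing import Dict, List, Tuple

def mct(etc: List[List[float]], n_machines: int) -> Tuple[List[int], float]:
    """Online MCT: take jobs in arrival order and place each one on the
    machine that gives it the earliest completion time."""
    ready = [0.0] * n_machines          # time at which each machine becomes free
    assignment = []
    for row in etc:                     # row[k] = estimated time of this job on machine k
        m = min(range(n_machines), key=lambda k: ready[k] + row[k])
        ready[m] += row[m]
        assignment.append(m)
    return assignment, max(ready)       # (machine per job, makespan)

def batch_schedule(etc: List[List[float]], n_machines: int,
                   pick_max: bool) -> Tuple[Dict[int, int], float]:
    """Batch heuristic. Each round computes every unassigned job's minimum
    completion time, then commits the job whose minimum is smallest
    (Min-min, pick_max=False) or largest (Max-min, pick_max=True)."""
    ready = [0.0] * n_machines
    unassigned = set(range(len(etc)))
    assignment: Dict[int, int] = {}
    chooser = max if pick_max else min
    while unassigned:
        best = {}
        for j in unassigned:
            m = min(range(n_machines), key=lambda k: ready[k] + etc[j][k])
            best[j] = (ready[m] + etc[j][m], m)
        # sorted() makes tie-breaking deterministic (lowest job index wins)
        j = chooser(sorted(best), key=lambda job: best[job][0])
        finish, m = best[j]
        ready[m] = finish
        assignment[j] = m
        unassigned.remove(j)
    return assignment, max(ready)

# Hypothetical workload: 3 jobs, 2 machines.
etc = [[2, 4], [3, 1], [6, 5]]
mct_assignment, mct_makespan = mct(etc, 2)
minmin_assignment, minmin_makespan = batch_schedule(etc, 2, pick_max=False)
maxmin_assignment, maxmin_makespan = batch_schedule(etc, 2, pick_max=True)
```

On this small workload Max-min finishes earlier than MCT and Min-min because it places the longest job first. That is a property of the sample values only and says nothing about the results reported in the paper.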

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ray, S., Zhang, Z. (2004). Heuristic-Based Scheduling to Maximize Throughput of Data-Intensive Grid Applications. In: Sen, A., Das, N., Das, S.K., Sinha, B.P. (eds) Distributed Computing - IWDC 2004. IWDC 2004. Lecture Notes in Computer Science, vol 3326. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30536-1_8

  • DOI: https://doi.org/10.1007/978-3-540-30536-1_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24076-1

  • Online ISBN: 978-3-540-30536-1

  • eBook Packages: Computer Science, Computer Science (R0)
