Performance-Driven Task and Data Co-scheduling Algorithms for Data-Intensive Applications in Grid Computing

  • Conference paper
Advanced Web Technologies and Applications (APWeb 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3007)

Abstract

Effective scheduling is a key concern in data-intensive grid computing, where higher performance must be achieved under many constraints. Based on a Dual-Component and Dual-Queue Distributed Schedule Model (DCDQDSM), we present task and data co-scheduling algorithms that reduce the time a scheduled task waits to access its datasets. First, data replication and elimination are scheduled by an independent approach. Second, if a task is divisible, the task and its dataset are divided into subtasks and the data subsets they require; task scheduling itself adopts a general approach. Finally, when a scheduled task or subtask does not hit its dataset, the associated data transfer is bound to that task. Based on the relation between task execution and data access, data replication and computation may proceed concurrently, either within one scheduled task with a divisible dataset or between scheduled tasks. Theoretical analysis and experimental results suggest that the scheduling algorithms improve execution performance and resource utilization.
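
The benefit of letting data replication and computation proceed concurrently within a divisible task can be illustrated with a small timing model. The sketch below is not the paper's algorithm: it assumes a single transfer link and a single compute node, and the function names and the per-chunk `(transfer, compute)` times are hypothetical values chosen only for illustration.

```python
from typing import List, Tuple

# Each chunk is (transfer_time, compute_time) for one data subset / subtask.
# These numbers are illustrative assumptions, not measurements from the paper.
Chunk = Tuple[float, float]


def sequential_makespan(chunks: List[Chunk]) -> float:
    """Transfer the entire dataset first, then run all subtasks."""
    total_transfer = sum(t for t, _ in chunks)
    total_compute = sum(c for _, c in chunks)
    return total_transfer + total_compute


def pipelined_makespan(chunks: List[Chunk]) -> float:
    """Overlap transfer and compute: a subtask starts as soon as its own
    data subset has arrived and the previous subtask has finished."""
    transfer_done = 0.0  # time at which the current chunk's data has arrived
    compute_done = 0.0   # time at which the previous subtask finished
    for transfer, compute in chunks:
        transfer_done += transfer  # chunks share one link, sent back to back
        compute_done = max(transfer_done, compute_done) + compute
    return compute_done


if __name__ == "__main__":
    chunks = [(2.0, 3.0)] * 4  # four equal data subsets
    print(sequential_makespan(chunks))  # 20.0
    print(pipelined_makespan(chunks))   # 14.0
```

In this toy model only the first chunk's transfer is exposed; every later transfer is hidden behind computation on an earlier chunk, which is the kind of waiting-time reduction the abstract describes.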

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, C., Chen, D., Zheng, Y., Hu, H. (2004). Performance-Driven Task and Data Co-scheduling Algorithms for Data-Intensive Applications in Grid Computing. In: Yu, J.X., Lin, X., Lu, H., Zhang, Y. (eds) Advanced Web Technologies and Applications. APWeb 2004. Lecture Notes in Computer Science, vol 3007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24655-8_36

  • DOI: https://doi.org/10.1007/978-3-540-24655-8_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21371-0

  • Online ISBN: 978-3-540-24655-8

  • eBook Packages: Springer Book Archive
