Streamline: scheduling streaming applications in a wide area environment

  • Regular Paper
  • Multimedia Systems

Abstract

Scheduling a streaming application on high-performance computing (HPC) resources has to be sensitive to the computation and communication needs of each stage of the application dataflow graph to ensure QoS criteria such as latency and throughput. Since the grid has evolved out of traditional high-performance computing, the tools available for scheduling are more appropriate for batch-oriented applications. Our scheduler, called Streamline, considers the dynamic nature of the grid and runs periodically to adapt scheduling decisions using application requirements (per-stage computation and communication needs), application constraints (such as co-location of stages), and resource availability. The performance of Streamline is compared with an Optimal placement, Simulated Annealing (SA) approximations, and E-Condor, a streaming grid scheduler built using Condor. For kernels of streaming applications, we show that Streamline performs close to the Optimal and SA algorithms, and an order of magnitude better than E-Condor under non-uniform load conditions. We also conduct scalability studies showing the advantage of Streamline over other approaches. Furthermore, we implement Streamline on Planetlab as a grid service and demonstrate that it performs close to the SA algorithm under dynamic resource conditions.

References

  1. Agarwalla, B., Ahmed, N., Hilley, D., Ramachandran, U.: Streamline: a scheduling heuristic for streaming application on the grid. In: 13th Annual Multimedia Computing and Networking Conference (MMCN’06), San Jose, CA (2006)

  2. Talwar, V. et al.: An environment for enabling interactive grids. In: 12th International Symposium on High-Performance Distributed Computing (HPDC’03), pp. 184–193. Seattle, WA (2003)

  3. Chen, L. et al.: GATES: a grid-based middleware for processing distributed data streams. In: 13th IEEE International Symposium on High-Performance Distributed Computing (HPDC-13) (2004)

  4. Foster I. and Kesselman C. (1997). Globus: a metacomputing infrastructure toolkit. Int. J. Supercomput. Appl. High Perf. Comput. 11: 115–128

  5. Boer, R.: Resource management in the Condor system. Master’s Thesis, Delft University of Technology (1996)

  6. Chapin, S.J., Katramatos, D., Karpovich, J., Grimshaw, A.S.: The Legion resource management system. In: Job Scheduling Strategies for Parallel Processing, pp. 162–178. Springer, Heidelberg (1999)

  7. Buyya, R., Abramson, D., Giddy, J.: Nimrod/G: an architecture for a resource management and scheduling system in a global computational grid. In: High Perf. Comput. (HPC) ASIA (2000)

  8. Frey, J. et al.: Condor-G: a computation management agent for multi-institutional grids. In: Proceedings of the 10th IEEE Symposium on High Performance Distributed Computing (HPDC10) (2001)

  9. Bavier, A., Bowman, M., Chun, B., Culler, D., Karlin, S., Muir, S., Peterson, L., Roscoe, T., Spalink, T., Wawrzoniak, M.: Operating system support for planetary-scale network services. In: 1st Symposium on Networked Systems Design and Implementation (NSDI’04), pp. 253–266 (2004)

  10. Wolenetz, M., Kumar, R., Shin, J., Ramachandran, U.: Middleware guidelines for future sensor networks. In: 1st Workshop on Broadband Advanced Sensor Networks (BASENETS’04), San Jose, CA (2004)

  11. Czajkowski, K., Fitzgerald, S., Foster, I., Kesselman, C.: Grid information services for distributed resource sharing. In: Proceedings of the 10th IEEE International Symposium on High-Performance Distributed Computing (HPDC-10) (2001)

  12. Wolski R. (1998). Dynamically forecasting network performance using the Network Weather Service. J. Cluster Comput. 1: 119–132

  13. Adam T., Chandy K. and Dickson J. (1974). A comparison of list schedules for parallel processing systems. Commun. ACM 17: 685–690

  14. Coffman E. (1976). Computer and Job-Shop Scheduling Theory. Wiley, New York

  15. Graham R., Lawler E., Lenstra J. and Kan A.R. (1979). Optimization and approximation in deterministic sequencing and scheduling: a survey. Ann. Discr. Math. 5: 287–326

  16. Ramamoorthy C., Chandy K. and Gonzalez M. (1972). Optimal scheduling strategies in a multiprocessor system. IEEE Trans. Comput. C-21: 137–146

  17. Hu T. (1961). Parallel sequencing and assembly line problems. Oper. Res. 9: 841–848

  18. Czajkowski, K., Foster, I., Karonis, N., Kesselman, C., Martin, S., Smith, W., Tuecke, S.: A resource management architecture for metacomputing systems. In: IPPS/SPDP ’98 Workshop on Job Scheduling Strategies for Parallel Processing, pp. 62–82 (1998)

  19. Metropolis N., Rosenbluth A.W., Rosenbluth M.N., Teller A.H. and Teller E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21: 1087–1091

  20. Kirkpatrick S., Gelatt C.D. and Vecchi M.P. (1983). Optimization by simulated annealing. Science 220(4598): 671–680

  21. Massie, M.L., Chun, B.N., Culler, D.E.: The Ganglia distributed monitoring system: design, implementation, and experience. Parallel Comput. 30 (2004)

  22. Simple XML parsing with SAX and DOM: http://www.onjava.com/pub/a/onjava/2002/06/26/xml.htm

  23. Nabrzyski J., Schopf J.M. and Weglarz J. (2003). Grid Resource Management: State of the Art and Future Trends. Kluwer, Dordrecht

  24. Chen, L., Agrawal, G.: Resource allocation in a middleware for streaming data. In: 2nd Workshop on Middleware for Grid Computing (MGC’04), Toronto, Canada, 18 October 2004

  25. Gerasoulis A. and Yang T. (1992). A comparison of clustering heuristics for scheduling DAGs on multiprocessors. J. Parallel Distrib. Comput. 16: 276–291

  26. Kwok Y.-K. and Ahmad I. (1996). Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors. IEEE Trans. on Parallel Distrib. Sys. 7(5): 506–521

  27. Topcuoglu H., Hariri S. and Wu M.-Y. (2002). Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans. on Parallel Distrib. Sys. 13: 260–274

  28. Gu, X., Nahrstedt, K., Yu, B.: Spidernet: an integrated peer-to-peer service composition framework. In: Proceedings of the 13th IEEE International Symposium on High Performance Distributed Computing (HPDC’04), pp. 110–119. IEEE Computer Society, Washington (2004)

  29. Liang, J., Nahrstedt, K.: Service composition for advanced multimedia applications. In: 12th Annual Multimedia Computing and Networking (MMCN 2005) (2005)

  30. Cherniack, M. et al.: Scalable Distributed Stream Processing. In: 1st Biennial Conference on Innovative Data Systems Research (CIDR’03). Asilomar (2003)

  31. Abadi D.J. and Carney D. et al. (2003). Aurora: a new model and architecture for data stream management. VLDB J. 12: 120–139

  32. Balazinska, M., Balakrishnan, H., Stonebraker, M.: Contract-based load management in federated distributed systems. In: 1st Symposium on Networked Systems Design and Implementation (NSDI), San Francisco (2004)

  33. Chandrasekaran, S. et al.: TelegraphCQ: continuous dataflow processing. In: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD’03) (2003)

Author information

Correspondence to Bikash Agarwalla.

Additional information

An earlier version of this paper appeared in [1]. This paper adds Sect. 6 on experiments in a wide area environment, where we describe our experience implementing the Streamline scheduler as a grid service on Planetlab and present our experimental results on Planetlab. We also update related work in Sect. 7.

About this article

Cite this article

Agarwalla, B., Ahmed, N., Hilley, D. et al. Streamline: scheduling streaming applications in a wide area environment. Multimedia Systems 13, 69–85 (2007). https://doi.org/10.1007/s00530-007-0082-0

  • DOI: https://doi.org/10.1007/s00530-007-0082-0
