Abstract
Scheduling a streaming application on high-performance computing (HPC) resources has to be sensitive to the computation and communication needs of each stage of the application dataflow graph in order to meet QoS criteria such as latency and throughput. Since the grid has evolved out of traditional high-performance computing, the available scheduling tools are better suited to batch-oriented applications. Our scheduler, called Streamline, considers the dynamic nature of the grid and runs periodically to adapt scheduling decisions using application requirements (per-stage computation and communication needs), application constraints (such as co-location of stages), and resource availability. The performance of Streamline is compared with an Optimal placement, Simulated Annealing (SA) approximations, and E-Condor, a streaming grid scheduler built using Condor. For kernels of streaming applications, we show that Streamline performs close to the Optimal and SA algorithms, and an order of magnitude better than E-Condor under non-uniform load conditions. We also conduct scalability studies showing the advantage of Streamline over the other approaches. Furthermore, we implement Streamline on Planetlab as a grid service and demonstrate that it performs close to the SA algorithm under dynamic resource conditions.
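To make the idea of placing stages according to their per-stage computation and communication needs concrete, the following is a minimal sketch of a generic greedy placement. It is not the Streamline algorithm itself: the `Stage` and `Resource` fields, the cost model in `stage_cost`, and the one-stage-per-resource rule are all assumptions made for illustration, and co-location constraints are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    compute: float  # hypothetical per-item computation demand
    comm: float     # hypothetical per-item communication demand


@dataclass
class Resource:
    name: str
    speed: float      # hypothetical available processing rate
    bandwidth: float  # hypothetical available network rate


def stage_cost(stage, resource):
    """Assumed cost model: estimated time for one item of `stage` on `resource`."""
    return stage.compute / resource.speed + stage.comm / resource.bandwidth


def greedy_place(stages, resources):
    """Assign each stage to a distinct resource, most demanding stage first."""
    if len(stages) > len(resources):
        raise ValueError("need at least one resource per stage")
    free = list(resources)
    placement = {}
    # Rank stages by total demand so the most demanding stage chooses first.
    for stage in sorted(stages, key=lambda s: s.compute + s.comm, reverse=True):
        best = min(free, key=lambda r: stage_cost(stage, r))
        placement[stage.name] = best.name
        free.remove(best)
    return placement


# Illustrative three-stage pipeline and three nodes (made-up numbers).
stages = [
    Stage("capture", compute=1.0, comm=4.0),
    Stage("detect", compute=8.0, comm=2.0),
    Stage("display", compute=1.0, comm=1.0),
]
resources = [
    Resource("n1", speed=4.0, bandwidth=2.0),
    Resource("n2", speed=1.0, bandwidth=1.0),
    Resource("n3", speed=2.0, bandwidth=4.0),
]
placement = greedy_place(stages, resources)
print(placement)
```

In a dynamic setting, a scheduler of this kind could be re-run periodically with refreshed `Resource` measurements, which is one simple way to adapt placements as availability changes.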
References
Agarwalla, B., Ahmed, N., Hilley, D., Ramachandran, U.: Streamline: a scheduling heuristic for streaming application on the grid. In: 13th Annual Multimedia Computing and Networking Conference (MMCN’06), San Jose, CA (2006)
Talwar, V. et al.: An environment for enabling interactive grids. In: 12th International Symposium on High-Performance Distributed Computing (HPDC’03), pp. 184–193. Seattle, WA (2003)
Chen, L. et al.: GATES: a grid-based middleware for processing distributed data streams. In: 13th IEEE International Symposium on High-Performance Distributed Computing (HPDC-13) (2004)
Foster I. and Kesselman C. (1997). Globus: a metacomputing infrastructure toolkit. Int. J. Supercomput. Appl. High Perf. Comput. 11: 115–128
Boer, R.: Resource management in the Condor system. Master’s Thesis, Delft University of Technology (1996)
Chapin, S.J., Katramatos, D., Karpovich, J., Grimshaw, A.S.: The Legion resource management system. In: Job Scheduling Strategies for Parallel Processing, pp. 162–178. Springer, Heidelberg (1999)
Buyya, R., Abramson, D., Giddy, J.: Nimrod/G: an architecture for a resource management and scheduling system in a global computational grid. In: High Perf. Comput. (HPC) ASIA (2000)
Frey, J. et al.: Condor-G: a computation management agent for multi-institutional grids. In: Proceedings of the 10th IEEE Symposium on High Performance Distributed Computing (HPDC10) (2001)
Bavier, A., Bowman, M., Chun, B., Culler, D., Karlin, S., Muir, S., Peterson, L., Roscoe, T., Spalink, T., Wawrzoniak, M.: Operating system support for planetary-scale network services. In: 1st Symposium on Networked Systems Design and Implementation (NSDI’04), pp. 253–266 (2004)
Wolenetz, M., Kumar, R., Shin, J., Ramachandran, U.: Middleware guidelines for future sensor networks. In: 1st Workshop on Broadband Advanced Sensor Networks (BASENETS’04), San Jose, CA (2004)
Czajkowski, K., Fitzgerald, S., Foster, I., Kesselman, C.: Grid information services for distributed resource sharing. In: Proceedings of the 10th IEEE International Symposium on High-Performance Distributed Computing (HPDC-10) (2001)
Wolski R. (1998). Dynamically forecasting network performance using the Network Weather Service. J. Cluster Comput. 1: 119–132
Adam T., Chandy K. and Dickson J. (1974). A comparison of list schedules for parallel processing systems. Commun ACM 17: 685–690
Coffman E. (1976). Computer and Job-Shop Scheduling Theory. Wiley, New York
Graham R., Lawler E., Lenstra J. and Kan A.R. (1979). Optimization and approximation in deterministic sequencing and scheduling: a survey. Ann. Discr. Math. 5: 287–326
Ramamoorthy C., Chandy K. and Gonzalez M. (1972). Optimal scheduling strategies in a multiprocessor system. IEEE Trans. Comput. C-21: 137–146
Hu T. (1961). Parallel sequencing and assembly line problems. Oper. Res. 9: 841–848
Czajkowski, K., Foster, I., Karonis, N., Kesselman, C., Martin, S., Smith, W., Tuecke, S.: A resource management architecture for metacomputing systems. In: IPPS/SPDP ’98 Workshop on Job Scheduling Strategies for Parallel Processing, pp. 62–82 (1998)
Metropolis N., Rosenbluth A.W., Rosenbluth M.N., Teller A.H. and Teller E. (1953). Equations of state calculations by fast computing machines. J. Chem. Phys. 21: 1087–1091
Kirkpatrick S., Gelatt C.D. and Vecchi M.P. (1983). Optimization by simulated annealing. Science 220(4598): 671–680
Massie, M.L., Chun, B.N., Culler, D.E.: The Ganglia distributed monitoring system: design, implementation, and experience. Parallel Comput. 30 (2004)
Simple XML parsing with SAX and DOM: http://www.onjava.com/pub/a/onjava/2002/06/26/xml.htm
Nabrzyski J., Schopf J.M. and Weglarz J. (2003). Grid Resource Management: State of the Art and Future Trends. Kluwer, Dordrecht
Chen, L., Agrawal, G.: Resource allocation in a middleware for streaming data. In: 2nd Workshop on Middleware for Grid Computing (MGC’04), Toronto, Canada (2004)
Gerasoulis A. and Yang T. (1992). A comparison of clustering heuristics for scheduling DAGs on multiprocessors. J. Parallel Distrib. Comput. 16: 276–291
Kwok Y.-K. and Ahmad I. (1996). Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors. IEEE Trans. on Parallel Distrib. Sys. 7(5): 506–521
Topcuoglu H., Hariri S. and Wu M.-Y. (2002). Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans. on Parallel Distrib. Sys. 13: 260–274
Gu, X., Nahrstedt, K., Yu, B.: Spidernet: an integrated peer-to-peer service composition framework. In: Proceedings of the 13th IEEE International Symposium on High Performance Distributed Computing (HPDC’04), pp. 110–119. IEEE Computer Society, Washington (2004)
Liang, J., Nahrstedt, K.: Service composition for advanced multimedia applications. In: 12th Annual Multimedia Computing and Networking (MMCN 2005) (2005)
Cherniack, M. et al.: Scalable distributed stream processing. In: 1st Biennial Conference on Innovative Data Systems Research (CIDR’03), Asilomar (2003)
Abadi D.J. and Carney D. et al. (2003). Aurora: a new model and architecture for data stream management. VLDB J. 12: 120–139
Balazinska, M., Balakrishnan, H., Stonebraker, M.: Contract-based load management in federated distributed systems. In: 1st Symposium on Networked Systems Design and Implementation (NSDI), San Francisco (2004)
Chandrasekaran, S. et al.: TelegraphCQ: continuous dataflow processing. In: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD’03) (2003)
Additional information
An earlier version of this paper appeared in [1]. This paper includes Sect. 6 on experiments in a wide area environment, where we describe our experience implementing the Streamline scheduler as a grid service on Planetlab and present our experimental results on Planetlab. We also update the related work in Sect. 7.
Cite this article
Agarwalla, B., Ahmed, N., Hilley, D. et al. Streamline: scheduling streaming applications in a wide area environment. Multimedia Systems 13, 69–85 (2007). https://doi.org/10.1007/s00530-007-0082-0
DOI: https://doi.org/10.1007/s00530-007-0082-0