Abstract
A key problem in executing performance-critical applications on distributed computing environments (e.g. the Grid) is the selection of resources. Research on “automatic resource selection” aims to allocate resources on behalf of users so as to optimize execution performance. However, most current approaches are static (i.e. resource selection is performed prior to execution) and need detailed application-specific information. In this paper, we introduce a novel on-line automatic resource selection approach. The approach is based on a simple feedback-control principle: the application continuously reports its Execution Satisfaction Degree (ESD) to the middleware Application Agent (AA), which uses the reported ESD values to learn the execution behavior and tunes the computing environment by adding, replacing, or deleting resources during execution in order to satisfy the user’s performance requirements. We introduce two policies that enable the AA to learn and tune the computing environment under this approach: the Utility Classification policy and the Desired Processing Power Estimation (DPPE) policy. Each policy is validated with an iterative application and a non-iterative application to demonstrate that both policies can effectively support most kinds of applications.
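The feedback loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not define how the ESD is computed, so the sketch treats it as the ratio of achieved to required processing rate, and the threshold rule used by the agent is a simple stand-in, not the Utility Classification or DPPE policy. The names `ApplicationAgent`, `simulated_esd`, `tolerance`, and `max_resources` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ApplicationAgent:
    """Toy stand-in for the middleware Application Agent (AA)."""
    resources: int = 1        # resources currently allocated
    tolerance: float = 0.1    # hypothetical dead band around ESD == 1.0
    max_resources: int = 16   # hypothetical size of the resource pool
    history: list = field(default_factory=list)

    def report(self, esd: float) -> str:
        """Receive one ESD report and return the tuning action taken."""
        self.history.append(esd)
        # Under-satisfied: grow the allocation if the pool allows it.
        if esd < 1.0 - self.tolerance and self.resources < self.max_resources:
            self.resources += 1
            return "add"
        # Over-satisfied: release a resource, keeping at least one.
        if esd > 1.0 + self.tolerance and self.resources > 1:
            self.resources -= 1
            return "delete"
        return "keep"


def simulated_esd(resources: int, required_rate: float,
                  rate_per_resource: float) -> float:
    """Assumed ESD: achieved processing rate divided by the required rate."""
    return resources * rate_per_resource / required_rate


def run(steps: int = 10):
    """Simulate an application that reports its ESD once per step."""
    agent = ApplicationAgent()
    actions = []
    for _ in range(steps):
        esd = simulated_esd(agent.resources, required_rate=4.0,
                            rate_per_resource=1.0)
        actions.append(agent.report(esd))
    return agent, actions


if __name__ == "__main__":
    agent, actions = run()
    print(agent.resources, actions)
```

In this toy run the agent adds one resource per report until the assumed ESD reaches 1.0 and then leaves the allocation unchanged. A "replace" action is omitted because it would need a model of per-resource performance, which the abstract does not give.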
Cite this article
Liu, H., Sørensen, SA. On-line feedback-based automatic resource configuration for distributed applications. Cluster Comput 13, 397–419 (2010). https://doi.org/10.1007/s10586-010-0123-x
DOI: https://doi.org/10.1007/s10586-010-0123-x