Abstract
To identify the tradeoffs between efficiency and fault-tolerance in dynamic cooperative computing, we initiate the study of a task-performing problem under dynamic process crashes/restarts and task injections. The system consists of n message-passing processes which, subject to dynamic crashes and restarts, cooperate in performing independent tasks that are continuously and dynamically injected into the system. The task specifications are not known a priori to the processes. This problem abstracts today's Internet-based computations, such as Grid computing and cloud services, where tasks are generated dynamically and different tasks may be known to different processes. We measure performance in terms of the number of pending tasks, so that it can be compared directly with the optimum number obtained under the same crash-restart-injection pattern by the best off-line algorithm. We propose several deterministic algorithmic solutions to the considered problem under different information models and correctness criteria, and we argue that their performance is close to that of the best possible off-line solutions.
The work of the first author is supported by research funds of the University of Cyprus. The work of the second author is supported by the Engineering and Physical Sciences Research Council [grant numbers EP/G023018/1, EP/H018816/1].
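The performance measure described in the abstract, the number of pending tasks under a fixed crash-restart-injection pattern, can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions that are not taken from the paper: rounds are synchronous, every task takes one round, a task injected in a round can be performed in that same round, and every live process knows every pending task. The function names and the example pattern are hypothetical. The first function models an idealized coordinated baseline in which live processes always pick distinct tasks; the second models a naive uncoordinated rule in which all live processes pick the same task.

```python
from typing import List, Set, Tuple

# One round of the pattern: (ids of processes alive in the round, tasks injected).
# This encoding is an assumption made for illustration only.
Round = Tuple[Set[int], int]


def pending_coordinated(pattern: List[Round]) -> List[int]:
    """Pending-task count after each round when live processes
    perform pairwise distinct tasks (idealized baseline)."""
    pending, history = 0, []
    for alive, injected in pattern:
        pending += injected
        # Each live process completes one distinct pending task.
        pending -= min(len(alive), pending)
        history.append(pending)
    return history


def pending_uncoordinated(pattern: List[Round]) -> List[int]:
    """Pending-task count after each round when every live process
    picks the same (e.g. lowest-numbered) pending task."""
    pending, history = 0, []
    for alive, injected in pattern:
        pending += injected
        # All live processes duplicate the same work: at most one task completes.
        if alive and pending > 0:
            pending -= 1
        history.append(pending)
    return history


# Hypothetical pattern: 3 processes, with crashes in rounds 2-3 and restarts in round 4.
pattern: List[Round] = [({0, 1, 2}, 4), ({0}, 2), (set(), 1), ({1, 2}, 0)]

print(pending_coordinated(pattern))    # [1, 2, 3, 1]
print(pending_uncoordinated(pattern))  # [3, 4, 5, 4]
```

Comparing the two pending-task histories round by round, under the same pattern, is the kind of comparison the abstract refers to; the algorithms of the paper itself are not reproduced here.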
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Georgiou, C., Kowalski, D.R. (2011). Performing Dynamically Injected Tasks on Processes Prone to Crashes and Restarts. In: Peleg, D. (ed.) Distributed Computing. DISC 2011. Lecture Notes in Computer Science, vol 6950. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24100-0_15
DOI: https://doi.org/10.1007/978-3-642-24100-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24099-7
Online ISBN: 978-3-642-24100-0
eBook Packages: Computer Science (R0)