ABSTRACT
In this paper we introduce a new network scheduling model. Jobs must be sent via routers on a tree to machines to be scheduled, and the communication is constrained by network bandwidth. The scheduler coordinates both network communication and the scheduling of jobs on machines. This type of scheduler is highly desirable in practice, yet few works have considered combining networking with job processing. We consider the popular objective of total flow time in the online setting. For any fixed ε > 0, we give a (1+ε)-speed O(1/ε^7)-competitive algorithm when all routers are identical and all machines are identical. We then show a (2+ε)-speed O(1/ε^7)-competitive algorithm when the routers are identical and the machines are unrelated. To show these results we introduce an interesting combination of potential function and dual fitting techniques, as well as a reduction of general tree scheduling to a special case of trees.
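As a small illustration of the objective named in the abstract, the sketch below computes total flow time under the standard definition: each job's flow time is its completion time minus its release (arrival) time, and the objective is the sum over all jobs. The `Job` class, its field names, and the sample values are hypothetical and are not taken from the paper; the sketch only evaluates a finished schedule and does not model routers, bandwidth, or the paper's algorithms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical record for illustration; not the paper's notation.
    release: float     # time the job arrives online
    completion: float  # time the job finishes processing on its machine


def total_flow_time(jobs):
    """Return the sum over all jobs of (completion - release)."""
    for job in jobs:
        if job.completion < job.release:
            raise ValueError("a job cannot complete before it is released")
    return sum(job.completion - job.release for job in jobs)


# Made-up schedule: flow times are 3, 1, and 4.
jobs = [Job(0.0, 3.0), Job(1.0, 2.0), Job(2.0, 6.0)]
print(total_flow_time(jobs))  # prints 8.0
```

In the model described above, a job's completion time would also include the time spent traversing routers before it reaches a machine, which is why the network and machine scheduling decisions jointly affect this objective.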
Scheduling in Bandwidth Constrained Tree Networks
SPAA '15: Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures