DOI: 10.1145/2755573.2755607
announcement

Fast and Better Distributed MapReduce Algorithms for k-Center Clustering

Published: 13 June 2015

ABSTRACT

In this paper we introduce a new network scheduling model. Here, jobs must be sent via routers on a tree to the machines on which they are scheduled, and communication is constrained by network bandwidth. The scheduler coordinates both network communication and the scheduling of jobs on machines. This type of scheduler is highly desirable in practice, yet few works have considered combining networking with job processing. We consider the popular objective of total flow time in the online setting. We give a (1+ε)-speed O(1/ε^7)-competitive algorithm for any fixed ε > 0 when all routers are identical and all machines are identical. We then show a (2+ε)-speed O(1/ε^7)-competitive algorithm when the routers are identical and the machines are unrelated. To show these results, we introduce a combination of potential function and dual fitting techniques, as well as a reduction from general tree scheduling to a special case of trees.


    • Published in

      SPAA '15: Proceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures
      June 2015
      362 pages
      ISBN:9781450335881
      DOI:10.1145/2755573
      • General Chair: Guy Blelloch
      • Program Chair: Kunal Agrawal

      Copyright © 2015 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • announcement

      Acceptance Rates

      SPAA '15 Paper Acceptance Rate: 31 of 131 submissions, 24%
      Overall Acceptance Rate: 447 of 1,461 submissions, 31%
