DOI: 10.1145/2479440.2479443
research-article

D-Zipfian: a decentralized implementation of Zipfian

Published: 24 June 2013

ABSTRACT

Zipfian distribution is used extensively to generate workloads to test, tune, and benchmark data stores. This paper presents a decentralized implementation of this technique, named D-Zipfian, using N parallel generators to issue requests. A request is a reference to a data item from a fixed population of data items. The challenge is for each generator to reference a disjoint set of data items. Moreover, they should finish at approximately the same time by performing work proportional to their processing capability. Intuitively, D-Zipfian assigns a total probability of 1/N to each of the N generators and requires each generator to reference data items with a scaled probability. In the case of heterogeneous generators, the total probability of each generator is proportional to its processing capability. We demonstrate the effectiveness of D-Zipfian using empirical measurements of the chi-square statistic.
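The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how data items are partitioned among generators, so the greedy heuristic below is an assumption, as are the function names, the `exponent` parameter, and the use of relative `capabilities` weights for heterogeneous generators. Each generator receives a disjoint set of items whose global Zipfian probabilities sum to roughly its target share (1/N when all capabilities are equal), and then draws from that set with each probability divided by the set's total mass. A Pearson chi-square statistic compares the combined request counts with the global Zipfian expectation.

```python
import random
from collections import Counter
from itertools import accumulate


def zipfian_probabilities(num_items, exponent=1.0):
    """Global Zipfian probabilities; the item at rank r has weight 1 / r**exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def partition_items(probs, capabilities):
    """Assign every item to exactly one generator (disjoint sets).

    Assumed greedy heuristic (not taken from the paper): walk items from most
    to least popular and give each one to the generator that is furthest
    below its target share, where target = capability / sum(capabilities).
    """
    total_capability = sum(capabilities)
    targets = [c / total_capability for c in capabilities]
    assigned = [[] for _ in capabilities]
    mass = [0.0] * len(capabilities)
    for item in sorted(range(len(probs)), key=lambda i: -probs[i]):
        g = max(range(len(capabilities)), key=lambda k: targets[k] - mass[k])
        assigned[g].append(item)
        mass[g] += probs[item]
    return assigned, mass


def make_generator(items, probs, seed=None):
    """Return a function that draws from `items` with scaled probabilities.

    Each item's global probability is divided by the generator's total mass,
    so the local probabilities sum to 1.
    """
    rng = random.Random(seed)
    local_mass = sum(probs[i] for i in items)
    cumulative = list(accumulate(probs[i] / local_mass for i in items))

    def next_request():
        return rng.choices(items, cum_weights=cumulative, k=1)[0]

    return next_request


def chi_square(observed, probs, total_requests):
    """Pearson chi-square statistic of observed counts against the global Zipfian."""
    return sum(
        (observed.get(i, 0) - total_requests * p) ** 2 / (total_requests * p)
        for i, p in enumerate(probs)
    )


if __name__ == "__main__":
    probs = zipfian_probabilities(num_items=1000)
    assigned, mass = partition_items(probs, capabilities=[1, 1, 1, 1])
    total_requests = 100_000
    observed = Counter()
    for k, items in enumerate(assigned):
        draw = make_generator(items, probs, seed=k)
        # Each generator issues work in proportion to its share of the mass.
        for _ in range(round(total_requests * mass[k])):
            observed[draw()] += 1
    issued = sum(observed.values())
    print("mass per generator:", [round(m, 4) for m in mass])
    print("chi-square:", round(chi_square(observed, probs, issued), 1))
```

Because generator k issues a fraction of the requests roughly equal to its mass, and picks item i with probability p_i divided by that mass, the expected share of all requests that reference item i is p_i, matching the original distribution. A disjoint partition cannot always hit the target shares exactly (for example, when a single item's probability exceeds a generator's share), so the sketch only approximates them.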


Published in

DBTest '13: Proceedings of the Sixth International Workshop on Testing Database Systems
June 2013
63 pages
ISBN: 9781450321518
DOI: 10.1145/2479440

Copyright © 2013 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

DBTest '13 paper acceptance rate: 9 of 15 submissions, 60%
Overall acceptance rate: 31 of 56 submissions, 55%
