research-article

KiWi: A Key-value Map for Scalable Real-time Analytics

Published: 21 June 2020

Abstract

We present KiWi, the first atomic KV-map to efficiently support simultaneous large scans and real-time access. The key to achieving this is treating scans as first-class citizens and organizing the data structure around them. KiWi provides wait-free scans, while its put operations are lightweight and lock-free. It optimizes memory management jointly with data structure access. We implement KiWi and compare it to state-of-the-art solutions. Compared to other KV-maps that provide atomic scans, KiWi performs either long scans or concurrent puts an order of magnitude faster. Its scans are twice as fast as non-atomic scans implemented via iterators in the Java skiplist.
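
For orientation, the non-atomic baseline in the last sentence can be pictured with a minimal sketch. It assumes that the "Java skiplist" is `java.util.concurrent.ConcurrentSkipListMap`, whose iterators are documented as weakly consistent. This is not KiWi's scan. The class name `SkipListScanSketch` and the method `scan` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration of a non-atomic range scan over the JDK skiplist.
// It is NOT the KiWi algorithm; it only shows the kind of baseline the
// abstract compares against.
public class SkipListScanSketch {

    // Collects the values of all keys in [from, to] using a sub-map view.
    // The view's iterator is weakly consistent: puts that run concurrently
    // with the loop may or may not be observed, so the returned list is not
    // guaranteed to be a snapshot of the map at any single point in time.
    static List<String> scan(ConcurrentSkipListMap<Integer, String> map,
                             int from, int to) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, String> e
                : map.subMap(from, true, to, true).entrySet()) {
            out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
        map.put(1, "a");
        map.put(5, "b");
        map.put(9, "c");
        // Single-threaded here, so the scan sees a stable map.
        System.out.println(scan(map, 2, 9));
    }
}
```

An atomic scan, as the abstract uses the term, would instead return a result consistent with the map's state at one point in time even while puts proceed concurrently.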


Published In

ACM Transactions on Parallel Computing  Volume 7, Issue 3
Special Issue on PPoPP 2017 (Part 2) and Regular Papers
September 2020
182 pages
ISSN:2329-4949
EISSN:2329-4957
DOI:10.1145/3407694

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 June 2020
Accepted: 01 April 2020
Revised: 01 March 2020
Received: 01 June 2018
Published in TOPC Volume 7, Issue 3

Author Tags

  1. Concurrent data structures
  2. key-value maps

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2024) Data Temperature Informed Streaming for Optimising Large-Scale Multi-Tiered Storage. Big Data Mining and Analytics 7:2 (371-398). DOI: 10.26599/BDMA.2023.9020039. Online publication date: Jun-2024
  • (2024) VERLIB: Concurrent Versioned Pointers. Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (200-214). DOI: 10.1145/3627535.3638501. Online publication date: 2-Mar-2024
  • (2024) CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers. Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (348-363). DOI: 10.1145/3627535.3638492. Online publication date: 2-Mar-2024
  • (2023) BP-Tree: Overcoming the Point-Range Operation Tradeoff for In-Memory B-Trees. Proceedings of the VLDB Endowment 16:11 (2976-2989). DOI: 10.14778/3611479.3611502. Online publication date: 24-Aug-2023
  • (2023) Practically and Theoretically Efficient Garbage Collection for Multiversioning. Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (66-78). DOI: 10.1145/3572848.3577508. Online publication date: 25-Feb-2023
  • (2023) Opportunities and Limitations of Hardware Timestamps in Concurrent Data Structures. 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (624-634). DOI: 10.1109/IPDPS54959.2023.00068. Online publication date: May-2023
  • (2022) A GPU Multiversion B-Tree. Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (481-493). DOI: 10.1145/3559009.3569681. Online publication date: 8-Oct-2022
  • (2022) HybriDS. Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures (321-332). DOI: 10.1145/3490148.3538591. Online publication date: 11-Jul-2022
  • (2021) Constant-time snapshots with applications to concurrent data structures. Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (31-46). DOI: 10.1145/3437801.3441602. Online publication date: 17-Feb-2021
