Abstract
In this paper, we examine the design tradeoffs of existing in-memory data structures of a state-of-the-art key-value store. We observe that no single data structure provides both fast point accesses and consistent ranged retrievals, and that naive amalgamations of existing structures fail to get the best of both worlds. Furthermore, our experiments reveal a performance anomaly when memory size increases: as more key-value pairs are kept in memory, the shortcomings of these data structures worsen. To address these problems, we present TeksDB, a fast and consistent key-value store with a novel in-memory data structure that efficiently handles both point and ranged accesses at a modest increase in memory footprint. Our evaluation demonstrates that TeksDB outperforms RocksDB by 3.6×, 9×, and 4.5× for get, scan, and range_query, respectively. The effectiveness of TeksDB extends to real-world workloads, achieving up to 3.3× speedup for YCSB.
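To make the tradeoff concrete, the following is a minimal sketch of a naive amalgamation: a hash map for point lookups kept alongside a sorted key list for ordered retrievals. It is not TeksDB's data structure, which the abstract does not describe. The class name `NaiveDualIndex`, the method signatures, and the exact semantics of `scan` and `range_query` are assumptions made for illustration, and the sketch is single-threaded, so it says nothing about consistency under concurrent writes.

```python
import bisect


class NaiveDualIndex:
    """Hypothetical example: a hash map plus a sorted key list."""

    def __init__(self):
        self._values = {}  # key -> value, expected O(1) point access
        self._sorted = []  # keys kept in sorted order for ordered scans

    def put(self, key, value):
        if key not in self._values:
            # A new key updates both structures; the list insert is O(n).
            bisect.insort(self._sorted, key)
        self._values[key] = value

    def get(self, key):
        # Point access touches only the hash map.
        return self._values.get(key)

    def scan(self, start, count):
        # Assumed semantics: up to `count` pairs with key >= start.
        i = bisect.bisect_left(self._sorted, start)
        return [(k, self._values[k]) for k in self._sorted[i:i + count]]

    def range_query(self, low, high):
        # Assumed semantics: all pairs with low <= key <= high.
        i = bisect.bisect_left(self._sorted, low)
        j = bisect.bisect_right(self._sorted, high)
        return [(k, self._values[k]) for k in self._sorted[i:j]]


store = NaiveDualIndex()
for k in ["d", "a", "c", "b"]:
    store.put(k, k.upper())
```

Each newly inserted key pays for two index updates, and the memory cost of both structures grows with the number of resident keys, which gives a rough sense of why simply combining existing structures does not get the best of both worlds.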
TeksDB: Weaving Data Structures for a High-Performance Key-Value Store
SIGMETRICS '19: Abstracts of the 2019 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems