Performance Evaluation of Range Queries in Key Value Stores

Journal of Grid Computing

Abstract

The number of key-value stores supporting data storage and applications in cloud environments has increased considerably in recent years. While all of these systems aim to offer highly available and scalable services in the cloud, they differ significantly from one another in their architecture and in the types of applications they try to support. In this paper we consider three widely used systems, Cassandra, HBase and Voldemort, and compare them in terms of their support for different types of query workloads, focusing mainly on range queries. Unlike HBase and Cassandra, which have built-in support for range queries, Voldemort does not support this type of query through its available API. We therefore present practical techniques for supporting range queries on top of Voldemort. Our performance evaluation is based on mixed query workloads, in the sense that they combine short and long range queries with other typical key-value store operations such as lookup and update. We show that there are trade-offs between the performance of the selected system and scheme and the types of query workloads that can be processed efficiently.

Author information

Correspondence to Pouria Pirzadeh.

About this article

Cite this article

Pirzadeh, P., Tatemura, J., Po, O. et al. Performance Evaluation of Range Queries in Key Value Stores. J Grid Computing 10, 109–132 (2012). https://doi.org/10.1007/s10723-012-9214-7

  • DOI: https://doi.org/10.1007/s10723-012-9214-7
