Abstract
Range queries are a common requirement in big data scenarios, such as analytic and time-travel queries over web archives. We present AdaSI, an adaptive partition-based caching approach for efficient range queries on key-value data. AdaSI partitions the data into data slices, each a run of consecutive data items. The AdaSI Hotscore Algorithm then aims to maximize the cache-hit probability under limited cache space. Using the Dutyrate and Hotscore measured for each data slice, AdaSI partitions hot data more finely to improve partitioning precision and adjustment sensitivity, whereas cold data are partitioned at a coarser granularity to reduce storage overhead and the search cost of queries. Our results show that the AdaSI Hotscore Algorithm achieves a cache hit rate nearly as high as that of record-based cache policies, while far outperforming them in speed and space usage.
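The idea of finer slices for hot data and coarser slices for cold data can be illustrated with a minimal sketch. This is not the AdaSI Hotscore Algorithm itself: the abstract does not define Dutyrate or Hotscore, so the `hits` counter, the `hot` threshold, the halving rule, and the greedy selection below are all hypothetical stand-ins chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    lo: int            # inclusive start key
    hi: int            # exclusive end key
    hits: float = 0.0  # hypothetical hotness counter (stand-in for Hotscore)

    @property
    def size(self) -> int:
        return self.hi - self.lo


def record_query(slices, q_lo, q_hi):
    """Credit every slice that overlaps the range query [q_lo, q_hi)."""
    for s in slices:
        if s.lo < q_hi and q_lo < s.hi:
            s.hits += 1


def repartition(slices, hot=4, min_size=2):
    """Split slices whose counter reaches `hot` (assumed threshold);
    cold slices are left at their coarser granularity."""
    out = []
    for s in slices:
        if s.hits >= hot and s.size >= 2 * min_size:
            mid = (s.lo + s.hi) // 2
            # Assumption: the parent's counter is shared evenly by the halves.
            out += [Slice(s.lo, mid, s.hits / 2), Slice(mid, s.hi, s.hits / 2)]
        else:
            out.append(s)
    return out


def choose_cached(slices, budget):
    """Greedily pick slices with the most hits per key until the
    cache budget (counted in keys) is exhausted."""
    chosen, used = [], 0
    for s in sorted(slices, key=lambda s: s.hits / s.size, reverse=True):
        if s.hits > 0 and used + s.size <= budget:
            chosen.append(s)
            used += s.size
    return chosen
```

Starting from two slices over keys 0 to 100, repeatedly querying a narrow range inside the first slice and then calling `repartition` splits only that slice, and `choose_cached` admits the resulting hot pieces that fit the budget. A real policy would also need to merge slices that cool down and to attribute hits to the correct half after a split.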
Acknowledgments
This work is funded by China NSF Grants (61223003, 61572250, 61362006), Jiangsu Province Science & Technology Research Grant (BE2014131), Guangxi NSF (2014GXNSFBA118288) and Guangxi IBAYT Program (KY2016YB065).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ge, W., Chen, M., Yuan, C., Huang, Y. (2016). An Adaptive Partition-Based Caching Approach for Efficient Range Queries on Key-Value Data. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9932. Springer, Cham. https://doi.org/10.1007/978-3-319-45817-5_27
DOI: https://doi.org/10.1007/978-3-319-45817-5_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45816-8
Online ISBN: 978-3-319-45817-5
eBook Packages: Computer Science, Computer Science (R0)