BF-Matrix: A Secondary Index for the Cloud Storage

  • Conference paper
Web-Age Information Management (WAIM 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8485)

Abstract

Although many NoSQL databases, also referred to as key-value stores, have been proposed, there is still no efficient solution to the problem of querying on non-key attributes. In this paper, we propose BF-Matrix, a hierarchical index composed of Bloom filters and B+ trees. Faced with massive data and large-scale clusters, the layered design shortens the search path and makes the best use of scattered resources. Moreover, it can scale up and scale back as the data size and cluster scale change, and it isolates the work of update and retrieval within a limited scope. To eliminate the risk of false negatives and to ensure that the index “looks consistent”, two rules are given to specify the behavior of index update and data retrieval. Experimental results demonstrate that our solution not only outperforms the state of the art, but is also flexible enough to adapt to the cloud environment.
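The layering described in the abstract can be illustrated with a minimal sketch. It is not the BF-Matrix algorithm from the paper: the class names (`BloomFilter`, `NodeIndex`), the `query` helper, the filter parameters, and the update ordering are all assumptions made for illustration, and a sorted list stands in for a B+ tree. The sketch only shows the general idea of using a per-node Bloom filter to decide which nodes' local indexes need to be searched for a non-key attribute value.

```python
import bisect
import hashlib


class BloomFilter:
    """Plain Bloom filter; sizes are illustrative, not taken from the paper."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, value):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        # May return a false positive, never a false negative.
        return all(self.bits[pos] for pos in self._positions(value))


class NodeIndex:
    """Hypothetical local index on one storage node.

    A sorted list of (attribute_value, primary_key) pairs stands in
    for a B+ tree; a Bloom filter summarizes the values it holds.
    """

    def __init__(self):
        self.entries = []
        self.filter = BloomFilter()

    def insert(self, value, key):
        # Assumed ordering (not from the paper): set the filter bits before
        # the entry becomes visible, so the filter never under-reports.
        self.filter.add(value)
        bisect.insort(self.entries, (value, key))

    def lookup(self, value):
        # (value,) sorts before every (value, key), so this finds the first match.
        i = bisect.bisect_left(self.entries, (value,))
        keys = []
        while i < len(self.entries) and self.entries[i][0] == value:
            keys.append(self.entries[i][1])
            i += 1
        return keys


def query(nodes, value):
    """Search only the nodes whose Bloom filter reports a possible match."""
    results = []
    for node in nodes.values():
        if node.filter.might_contain(value):
            results.extend(node.lookup(value))
    return sorted(results)
```

In this sketch a false positive in a filter costs one wasted local lookup, while a false negative would silently drop results, which is why the filter is updated before the local index.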

This work was supported by the Natural Science Foundation of China (No. 60973002 and No. 61170003), the National High Technology Research and Development Program of China (Grant No. 2012AA011002), and the MOE-CMCC Research Fund.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Cheng, X., Li, H., Wang, Y., Wang, T., Yang, D. (2014). BF-Matrix: A Secondary Index for the Cloud Storage. In: Li, F., Li, G., Hwang, Sw., Yao, B., Zhang, Z. (eds) Web-Age Information Management. WAIM 2014. Lecture Notes in Computer Science, vol 8485. Springer, Cham. https://doi.org/10.1007/978-3-319-08010-9_40

  • DOI: https://doi.org/10.1007/978-3-319-08010-9_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08009-3

  • Online ISBN: 978-3-319-08010-9

  • eBook Packages: Computer Science (R0)
