Fast Multi-fields Query Processing in Bigtable Based Cloud Systems

  • Conference paper
Web-Age Information Management (WAIM 2013)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7923))

Abstract

With the rapid growth of data sizes, enterprise applications are migrating their backend data management and analytics systems to cloud-based data management systems. Bigtable is one of the major data models used by cloud storage systems as their storage layer. Such systems provide high scalability and schema flexibility, and they support efficient point and range queries on rowkeys. However, Bigtable-based systems offer limited support for non-rowkey queries and multi-field queries, because such queries incur the overhead of additional data scans. In this paper, we develop TNBGR (Telecom Network Browsing Gateway Records), a system for managing and querying large-scale telecommunication data. TNBGR is built on top of HBase and MapReduce, with a focus on optimizing multi-field query processing. TNBGR provides a novel application- and system-resource-aware data allocation strategy that minimizes data access through multi-layer region partitioning, resource parameterization, and balanced region distribution. Query composition dynamically updates application parameters based on tracked system statistics and automatically translates queries for MapReduce. With additional query optimization that improves region locality, TNBGR supports multi-field queries with high efficiency. The experimental results show that our solution improves query performance by about 5 and 18 times, respectively.
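The gap the abstract describes, between rowkey range queries and queries on other fields, can be illustrated with a small sketch. This is not TNBGR's partitioning scheme or HBase's API. It is a toy in-memory table kept sorted by rowkey, and the `ToySortedTable` class, the `user` and `hour` fields, and the `user#hour` composite rowkey are all assumptions made up for the illustration. It shows that a two-field predicate answered through a composite rowkey touches only the matching slice, while the same predicate applied as a field filter has to visit every row.

```python
from bisect import bisect_left


class ToySortedTable:
    """Toy stand-in for a Bigtable-style table: rows kept sorted by rowkey."""

    def __init__(self, rows):
        # rows: iterable of (rowkey, fields_dict) pairs
        self._rows = sorted(rows, key=lambda row: row[0])
        self._keys = [key for key, _ in self._rows]
        self.rows_touched = 0  # counts rows visited by queries

    def range_scan(self, start, stop):
        """Return rows with start <= rowkey < stop, visiting only that slice."""
        lo = bisect_left(self._keys, start)
        hi = bisect_left(self._keys, stop)
        self.rows_touched += hi - lo
        return self._rows[lo:hi]

    def full_scan_filter(self, predicate):
        """Return rows whose fields satisfy predicate, visiting every row."""
        self.rows_touched += len(self._rows)
        return [row for row in self._rows if predicate(row[1])]


# Hypothetical records: 5 users x 24 hours, rowkey = "<user>#<hour>".
records = [
    (f"u{u}#{h:02d}", {"user": f"u{u}", "hour": h})
    for u in range(5)
    for h in range(24)
]
table = ToySortedTable(records)

# Multi-field query: user == "u2" and 10 <= hour < 12.
# Because both fields are encoded in the rowkey, it becomes one range scan.
by_key = table.range_scan("u2#10", "u2#12")
key_cost = table.rows_touched

# The same query expressed as a filter on non-rowkey fields.
table.rows_touched = 0
by_filter = table.full_scan_filter(
    lambda f: f["user"] == "u2" and 10 <= f["hour"] < 12
)
scan_cost = table.rows_touched

print(key_cost, scan_cost)
```

Both queries return the same two rows, but the range scan visits 2 rows and the filter visits all 120. A composite rowkey only helps when the query constrains a prefix of the key. That limitation is one reason a system would need a more deliberate data layout for general multi-field queries.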


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, H., Ci, X., Meng, X. (2013). Fast Multi-fields Query Processing in Bigtable Based Cloud Systems. In: Wang, J., Xiong, H., Ishikawa, Y., Xu, J., Zhou, J. (eds) Web-Age Information Management. WAIM 2013. Lecture Notes in Computer Science, vol 7923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38562-9_15

  • DOI: https://doi.org/10.1007/978-3-642-38562-9_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38561-2

  • Online ISBN: 978-3-642-38562-9

  • eBook Packages: Computer Science, Computer Science (R0)
