Fast Multi-fields Query Processing in Bigtable Based Cloud Systems

Wang, Haiping; Ci, Xiang; Meng, Xiaofeng

doi:10.1007/978-3-642-38562-9_15

Haiping Wang²¹,
Xiang Ci²¹ &
Xiaofeng Meng²¹

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7923))

Included in the following conference series:

International Conference on Web-Age Information Management

3464 Accesses
1 Citations

Abstract

With the rapid increase of data sizes, enterprise applications are migrating their backend data management and analytic systems into cloud based data management systems.Bigtable is among one of the major data models used by cloud storage systems as their storage layer. Such systems provide high scalability and schema flexibility, and support efficient point and range based queries based on rowkeys. However, Bigtable based systems have limited support on non-rowkey based queries and multiple-fields based queries, due to much overhead on invoking extra scanning of data. In this paper, we develop a system TNBGR(Telecom Network Browsing Gateway Records) on managing and querying large scale telecommunication data. TNBGR is built on top of HBase and MapReduce, with a focus on optimizing multi-fields query processing. TNBGR provides a novel application and system resource aware data allocation strategy to minimize data access through multi-layer region partitioning, resource parameterization, and balanced region distribution.The query composition dynamically updates application parameters based on tracked system statistics and automatically translates queries for MapReduce. Through additional query optimization by improving region locality, TNBGR achieves high efficiency on supporting multi-field queries. The experimental results show that our solution improves the performance of the queries by about 5 and 18 times respectively.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Agrawal, P., Silberstein, A., Cooper, B., Srivastava, U., Ramakrishnan, R.: Asynchronous view maintenance for vlsd databases. In: SIGMOD 2009, pp. 179–192. ACM (2009)
Google Scholar
Dean, J., Ghemawat, S.: Mapreduce: Simplified data processing on large clusters. Communications of the ACM 51(1), 107–113 (2008)
Article Google Scholar
Ding, L., Qiao, B., Wang, G., Chen, C.: An efficient quad-tree based index structure for cloud data management. In: Wang, H., Li, S., Oyama, S., Hu, X., Qian, T. (eds.) WAIM 2011. LNCS, vol. 6897, pp. 238–250. Springer, Heidelberg (2011)
Chapter Google Scholar
Chang, F., Dean, J., Ghemawat, S., et al.: Bigtable: A distributed storage system for structured data. In: OSDI 2006, pp. 205–218 (2006)
Google Scholar
Kellerman, J.: Hbase: Structured storage of sparse data for hadoop (2009), http://hbase.apache.org/
Kennedy, J.: Ithbase (2012), https://github.com/hbase-trx/hbase-transactional-tableindexed
Papadopoulos, A., Katsaros, D.: A-tree: Distributed indexing of multidimensional data for cloud computing environments. In: CloudCom 2011, pp. 407–414 (2011)
Google Scholar
Wang, J., Wu, S., Gao, H., Li, J., Ooi, B.: Indexing multi-dimensional data in a cloud system. In: SIGMOD 2010, pp. 591–602. ACM (2010)
Google Scholar
ykulbak. Ihbase (2012), https://github.com/ykulbak/ihbase
Zhang, X., Ai, J., Wang, Z., Lu, J., Meng, X.: An efficient multi-dimensional index for cloud data management. In: CloudDB 2009, pp. 17–24. ACM (2009)
Google Scholar
Zou, Y., Liu, J., Wang, S., Zha, L., Xu, Z.: CCIndex: A complemental clustering index on distributed ordered tables for multi-dimensional range queries. In: Ding, C., Shao, Z., Zheng, R. (eds.) NPC 2010. LNCS, vol. 6289, pp. 247–261. Springer, Heidelberg (2010)
Chapter Google Scholar

Download references

Author information

Authors and Affiliations

School of Information, Renmin University, Beijing, China
Haiping Wang, Xiang Ci & Xiaofeng Meng

Authors

Haiping Wang
View author publications
You can also search for this author in PubMed Google Scholar
Xiang Ci
View author publications
You can also search for this author in PubMed Google Scholar
Xiaofeng Meng
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Jianyong Wang
Management Science and Information Systems Department, Rutgers, the State University of New Jersey, 1, Washington Park, 07102, Newark, NJ, USA
Hui Xiong
Department of Information Engineering, Nagoya University, 464-8601, Nagoya, Japan
Yoshiharu Ishikawa
Department of Computer Science, Hong Kong Baptist University, Hong Kong
Jianliang Xu
School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
Junfeng Zhou

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wang, H., Ci, X., Meng, X. (2013). Fast Multi-fields Query Processing in Bigtable Based Cloud Systems. In: Wang, J., Xiong, H., Ishikawa, Y., Xu, J., Zhou, J. (eds) Web-Age Information Management. WAIM 2013. Lecture Notes in Computer Science, vol 7923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38562-9_15

Download citation

DOI: https://doi.org/10.1007/978-3-642-38562-9_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38561-2
Online ISBN: 978-3-642-38562-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics