Optimizing B$^+$-Tree Searches on Coupled CPU-GPU Architectures

Huang, Han; Luan, Hua

doi:10.1007/978-3-030-60245-1_28

Conference paper
First Online: 29 September 2020

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12452)

Abstract

The B$^+$-tree is an important index in the fields of data warehousing and database management systems. With the development of new hardware technologies, the B$^+$-tree needs to be revisited to take full advantage of hardware resources. In this paper, we focus on optimization techniques to increase the search performance of B$^+$-trees on the coupled CPU-GPU architecture. First, we propose a hierarchical searching approach on the single coupled GPU to deal efficiently with the leaf nodes of B$^+$-trees. It adopts a flexible strategy to determine how many work items in a work group cooperate to search for one key, in order to reduce irregular memory accesses and divergent branches within the work group. Second, we present a co-processing pipeline method on the coupled architecture. The CPU and the integrated GPU process the sorting and searching tasks simultaneously to hide sorting and partial searching latencies. A distribution model is designed to support a workload balance strategy based on real-time performance. Our performance study shows that the hierarchical searching scheme provides an improvement of up to 36% on the GPU compared to the baseline algorithm with a fixed number of work items, and that the co-processing pipeline method further increases the throughput by a factor of 1.8. To the best of our knowledge, this paper is the first study to consider both the CPU and the coupled GPU to optimize B$^+$-tree searches.
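The abstract does not state the formula behind the distribution model, so the following is only a minimal sketch of one plausible form of workload balancing based on real-time performance: each batch of search keys is split between the CPU and the GPU in proportion to their most recently measured throughput. The function names `split_batch` and `update_rate`, the smoothing factor `alpha`, and the proportional rule itself are assumptions made for illustration, not the authors' method.

```python
def split_batch(num_keys, cpu_rate, gpu_rate):
    """Split a batch of search keys between CPU and GPU.

    Assumption: shares are proportional to the latest measured
    throughput of each device (keys per second). Returns a tuple
    (cpu_keys, gpu_keys) that always sums to num_keys.
    """
    if num_keys < 0:
        raise ValueError("num_keys must be non-negative")
    total = cpu_rate + gpu_rate
    if total <= 0:
        # No measurements yet: fall back to an even split.
        gpu_keys = num_keys // 2
    else:
        gpu_keys = round(num_keys * gpu_rate / total)
    return num_keys - gpu_keys, gpu_keys


def update_rate(old_rate, keys_done, seconds, alpha=0.5):
    """Refresh a device's throughput estimate after it finishes a batch.

    Assumption: exponential smoothing with factor alpha, so one noisy
    batch does not swing the next split too far.
    """
    if seconds <= 0:
        return old_rate  # ignore a degenerate timing
    measured = keys_done / seconds
    if old_rate == 0:
        return measured  # first measurement replaces the empty estimate
    return alpha * measured + (1 - alpha) * old_rate


if __name__ == "__main__":
    cpu_rate, gpu_rate = 0.0, 0.0
    cpu_keys, gpu_keys = split_batch(1000, cpu_rate, gpu_rate)
    # Hypothetical timings for the first batch, used only to drive the example.
    cpu_rate = update_rate(cpu_rate, cpu_keys, 5.0)
    gpu_rate = update_rate(gpu_rate, gpu_keys, 2.0)
    print(split_batch(1000, cpu_rate, gpu_rate))
```

In a pipeline of this kind, the rates would be refreshed after every batch so that the split tracks how the two devices actually perform while they share memory bandwidth.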

Supported by the National Key R&D Program of China (No. 2017YFC0804004), and a grant from the Capital Science and Technology Innovation Vouchers of China.

Author information

Authors and Affiliations

Beijing Normal University, Beijing, China
Han Huang & Hua Luan

Corresponding author

Correspondence to Hua Luan.

Editor information

Editors and Affiliations

Columbia University, New York, NY, USA
Meikang Qiu

About this paper

Cite this paper

Huang, H., Luan, H. (2020). Optimizing B$^+$-Tree Searches on Coupled CPU-GPU Architectures. In: Qiu, M. (ed.) Algorithms and Architectures for Parallel Processing. ICA3PP 2020. Lecture Notes in Computer Science, vol 12452. Springer, Cham. https://doi.org/10.1007/978-3-030-60245-1_28

DOI: https://doi.org/10.1007/978-3-030-60245-1_28
Published: 29 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60244-4
Online ISBN: 978-3-030-60245-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
