Hetero-DB: Next Generation High-Performance Database Systems by Best Utilizing Heterogeneous Computing and Storage Resources

  • Regular Paper
Journal of Computer Science and Technology

Abstract

With recent advances in hardware technologies, new general-purpose high-performance devices such as the graphics processing unit (GPU) and the solid state drive (SSD) have been widely adopted. Compared with a multicore CPU, a GPU may offer an order of magnitude higher throughput for applications with massive data parallelism. Likewise, an SSD can offer much higher I/O throughput and lower latency than a traditional hard disk drive (HDD). These new hardware devices can significantly boost the performance of many applications, so the database community has been actively working to adopt them in database systems. However, the performance benefit cannot be easily reaped if the new hardware is used improperly. In this paper, we propose Hetero-DB, a high-performance database system whose design and implementation exploit both the characteristics of the database system and the special properties of the new hardware devices. Hetero-DB develops a GPU-aware query execution engine with GPU device memory management and a query scheduling mechanism to support concurrent query execution. Furthermore, for the SSD-HDD hybrid storage system, we redesign the storage engine by organizing HDD and SSD into a two-level caching hierarchy. To best utilize the hybrid hardware devices, the semantic information that is critical for storage I/O is identified and passed to the storage manager, which has great potential to improve efficiency and performance. Hetero-DB aims to maximize the performance benefits of GPU and SSD, and demonstrates its effectiveness for designing next-generation database systems.
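The idea of passing semantic information to a two-level SSD-HDD storage manager can be illustrated with a toy sketch. This is not Hetero-DB's implementation: the `Hint` categories, the `HybridStore` class, the LRU eviction policy, and the admission rule (cache only blocks read with a random-access hint) are all assumptions made for illustration.

```python
from collections import OrderedDict
from enum import Enum


class Hint(Enum):
    """Hypothetical access-pattern hints; not taken from the paper."""
    SEQUENTIAL = "sequential"  # large scans: little reuse expected
    RANDOM = "random"          # point lookups: reuse is more likely


class HybridStore:
    """Toy two-level store: a small SSD cache (LRU) in front of an HDD."""

    def __init__(self, hdd, ssd_capacity):
        self.hdd = hdd                    # dict: block id -> data
        self.ssd = OrderedDict()          # block id -> data, in LRU order
        self.ssd_capacity = ssd_capacity
        self.ssd_hits = 0
        self.hdd_reads = 0

    def read(self, block_id, hint):
        if block_id in self.ssd:
            self.ssd.move_to_end(block_id)  # mark as most recently used
            self.ssd_hits += 1
            return self.ssd[block_id]
        data = self.hdd[block_id]
        self.hdd_reads += 1
        # Assumed admission rule: only randomly accessed blocks are cached,
        # so a large scan cannot flush the SSD.
        if hint is Hint.RANDOM:
            self._admit(block_id, data)
        return data

    def _admit(self, block_id, data):
        self.ssd[block_id] = data
        if len(self.ssd) > self.ssd_capacity:
            self.ssd.popitem(last=False)  # evict least recently used


hdd = {i: f"block-{i}" for i in range(100)}
store = HybridStore(hdd, ssd_capacity=4)

for _ in range(2):                      # hot blocks, read twice
    for b in (1, 2, 3):
        store.read(b, Hint.RANDOM)
for b in range(10, 60):                 # one large sequential scan
    store.read(b, Hint.SEQUENTIAL)
store.read(1, Hint.RANDOM)              # hot block is still on the SSD
```

Without the hint, the 50-block scan would pass through the 4-block SSD cache and evict the hot blocks. With it, the scan goes straight to the HDD and the final read of block 1 is still served from the SSD.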

Author information

Correspondence to Xiaodong Zhang.

Additional information

This work was supported in part by the National Science Foundation of USA under Grant Nos. CCF-0913050, OCI-1147522, and CNS-1162165.

About this article

Cite this article

Zhang, K., Chen, F., Ding, X. et al. Hetero-DB: Next Generation High-Performance Database Systems by Best Utilizing Heterogeneous Computing and Storage Resources. J. Comput. Sci. Technol. 30, 657–678 (2015). https://doi.org/10.1007/s11390-015-1553-y

  • DOI: https://doi.org/10.1007/s11390-015-1553-y
