Abstract
Specialized processing units such as GPUs or FPGAs offer great opportunities to speed up database operations by exploiting parallelism and relieving the CPU. However, using coprocessors efficiently poses major challenges for developers. Besides finding fine-grained data-parallel algorithms and tuning them for the available hardware, the system has to decide at runtime which (co)processor should execute a specific task. Depending on the input parameters, wrong decisions can lead to severe performance degradation, since involving a coprocessor introduces significant overhead, e.g., for data transfers. In this paper, we present a framework that automatically learns and adapts execution models for arbitrary algorithms on any (co)processor in order to find break-even points and support scheduling decisions. We demonstrate its applicability for three common use cases in modern database systems and show how their performance can be improved by well-informed scheduling decisions.
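To make the idea of a learned execution model and a break-even point concrete, the following is a minimal sketch and not the framework described in the paper. It assumes that execution time grows linearly with input size, fits one least-squares line per processing unit from observed runs, and picks the unit with the lower predicted time. The names `LinearCostModel`, `choose_unit`, and `break_even` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LinearCostModel:
    """Hypothetical model: time = intercept + slope * size (an assumption)."""
    sizes: list = field(default_factory=list)
    times: list = field(default_factory=list)

    def observe(self, size, elapsed):
        # Record one measured execution so the model can adapt over time.
        self.sizes.append(size)
        self.times.append(elapsed)

    def coefficients(self):
        # Ordinary least squares for a single predictor.
        # Needs at least two observations with distinct sizes.
        n = len(self.sizes)
        mean_x = sum(self.sizes) / n
        mean_y = sum(self.times) / n
        sxx = sum((x - mean_x) ** 2 for x in self.sizes)
        sxy = sum((x - mean_x) * (y - mean_y)
                  for x, y in zip(self.sizes, self.times))
        slope = sxy / sxx
        return mean_y - slope * mean_x, slope

    def predict(self, size):
        intercept, slope = self.coefficients()
        return intercept + slope * size


def choose_unit(models, size):
    # Pick the processing unit with the lowest predicted execution time.
    return min(models, key=lambda name: models[name].predict(size))


def break_even(first, second):
    # Input size at which both fitted lines predict the same time,
    # or None if the lines are parallel.
    a1, b1 = first.coefficients()
    a2, b2 = second.coefficients()
    if b1 == b2:
        return None
    return (a2 - a1) / (b1 - b2)


# Made-up measurements: the coprocessor has a fixed transfer overhead
# but a lower cost per element.
cpu, gpu = LinearCostModel(), LinearCostModel()
for size in (100, 200, 300):
    cpu.observe(size, 2.0 * size)
    gpu.observe(size, 500.0 + 1.0 * size)

models = {"cpu": cpu, "gpu": gpu}
print(choose_unit(models, 100), choose_unit(models, 1000), break_even(cpu, gpu))
```

With these illustrative numbers, small inputs stay on the CPU because the fixed overhead dominates, while large inputs go to the coprocessor. A real system would likely need richer models than a straight line and more input parameters than size alone.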
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Breß, S., Beier, F., Rauhe, H., Schallehn, E., Sattler, KU., Saake, G. (2012). Automatic Selection of Processing Units for Coprocessing in Databases. In: Morzy, T., Härder, T., Wrembel, R. (eds) Advances in Databases and Information Systems. ADBIS 2012. Lecture Notes in Computer Science, vol 7503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33074-2_5
DOI: https://doi.org/10.1007/978-3-642-33074-2_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33073-5
Online ISBN: 978-3-642-33074-2
eBook Packages: Computer Science, Computer Science (R0)