Abstract
This paper presents an approach for modeling the achievable speed-ups of FPGAs (Field Programmable Gate Arrays) or GPUs (Graphics Processing Units) used as coprocessors in hybrid computing systems. The underlying computation model assumes that the coprocessors are separate devices and that their input and output data are transferred from and to the system's memory. The model considers all overheads incurred when (sub-)tasks are performed on a coprocessor instead of the CPU. The validity of the model is checked against measured values by means of a sample application. In addition, the theoretical maximum speed-ups of two hybrid systems over an optimal single-core CPU implementation are approximated. With the penalty factor P_SEQ as a measure of the degree to which a program cannot be fully parallelized because of data dependencies, a system with an NVIDIA GTX 285 GPU achieves a speed-up of 2.7 times P_SEQ, while a single node of a Cray XD1 with a Xilinx Virtex4 LX160 achieves a speed-up of about 1 times P_SEQ.
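The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of charging transfer and setup overheads to an offloaded task. It is not the authors' model. The function names, the parameters, and the simple additive cost formula are all assumptions made for illustration; only the factors 2.7 (GTX 285 system) and about 1 (Cray XD1 node) come from the abstract.

```python
def offload_speedup(t_cpu, t_kernel, bytes_in, bytes_out, bandwidth, t_setup=0.0):
    """Speed-up of running a task on a separate-device coprocessor.

    Hypothetical additive cost model (an assumption, not the paper's):
    offload time = setup + transfer of input and output + kernel time.
    Times are in seconds, sizes in bytes, bandwidth in bytes per second.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    t_transfer = (bytes_in + bytes_out) / bandwidth
    t_offload = t_setup + t_transfer + t_kernel
    if t_offload <= 0:
        raise ValueError("total offload time must be positive")
    return t_cpu / t_offload


def speedup_from_penalty(system_factor, p_seq):
    """Approximate maximum speed-up as system_factor times P_SEQ.

    The abstract reports system_factor = 2.7 for the GTX 285 system and
    about 1 for a Cray XD1 node; p_seq values must come from the paper.
    """
    return system_factor * p_seq


if __name__ == "__main__":
    # Illustrative numbers only: they are not measurements from the paper.
    s = offload_speedup(t_cpu=1.0, t_kernel=0.05,
                        bytes_in=100e6, bytes_out=100e6, bandwidth=4e9)
    print(f"speed-up with transfer overhead: {s:.2f}")

    # Same kernel, but a slow link makes the transfer dominate.
    s_slow = offload_speedup(t_cpu=1.0, t_kernel=0.05,
                             bytes_in=100e6, bytes_out=100e6, bandwidth=1e8)
    print(f"speed-up over a slow link: {s_slow:.2f}")
```

In this sketch a fast kernel can still lose to the CPU when the data transfer dominates, which is why a model for separate-device coprocessors has to account for the overheads and not only for the kernel time.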
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hampel, V., Pionteck, T., Maehle, E. (2012). An Approach for Performance Estimation of Hybrid Systems with FPGAs and GPUs as Coprocessors. In: Herkersdorf, A., Römer, K., Brinkschulte, U. (eds) Architecture of Computing Systems – ARCS 2012. ARCS 2012. Lecture Notes in Computer Science, vol 7179. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28293-5_14
DOI: https://doi.org/10.1007/978-3-642-28293-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28292-8
Online ISBN: 978-3-642-28293-5
eBook Packages: Computer Science, Computer Science (R0)