GPU-based Acceleration of System-level Design Tasks

International Journal of Parallel Programming

Abstract

Many system-level design tasks (e.g., high-level timing analysis, hardware/software partitioning, and design space exploration) involve computational kernels that are intractable (usually NP-hard). As a result, they incur long running times even for mid-sized problems. In this paper we explore the possibility of using commodity graphics processing units (GPUs) to accelerate such tasks, which commonly arise in the electronic design automation (EDA) domain. We demonstrate this idea via two detailed case studies. The first explores the use of GPUs to speed up standard schedulability analysis problems. The second proposes a GPU-based engine for a general hardware/software design space exploration problem. Not only do these problems commonly arise in the embedded systems domain, but their computational kernels also turn out to be variants of a combinatorial optimization problem, namely the knapsack problem, that lies at the heart of several EDA applications. Experimental results show that our GPU-based implementations offer speedups of up to 100× for the computational kernels and up to 17× for the full problem. Unlike ASIC/FPGA-based accelerators, our solution involves no extra hardware cost, since even low-end desktop and notebook computers are now equipped with GPUs. Although recent research has shown the benefits of using GPUs for a variety of non-graphics applications (e.g., in databases and bioinformatics), harnessing the parallelism of GPUs to accelerate problems from the EDA domain has not been sufficiently explored so far. We believe that our results, together with the generality of the core problem that we address, will motivate researchers from this community to explore the use of GPUs for a wider variety of problems from the EDA domain.
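
To make the knapsack connection concrete, the following is a minimal sketch, assuming the textbook 0/1 knapsack formulation with positive integer weights rather than the specific variants used in the two case studies. It is plain Python, not GPU code, and the function name `knapsack_row_parallel` is hypothetical. It only illustrates why this kind of kernel may map well to a GPU: each row of the dynamic programming table depends solely on the previous row, so all entries of a row could in principle be computed concurrently.

```python
def knapsack_row_parallel(weights, values, capacity):
    """0/1 knapsack by dynamic programming, one table row per item.

    Illustrative sketch only (assumed textbook formulation with
    integer weights); not the implementation described in the paper.
    """
    # prev[c] = best value achievable with capacity c using the items seen so far
    prev = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Each entry of the new row reads only from the previous row,
        # so the capacity + 1 entries are independent of one another.
        # On a GPU, each entry could be assigned to its own thread.
        prev = [
            max(prev[c], prev[c - w] + v) if c >= w else prev[c]
            for c in range(capacity + 1)
        ]
    return prev[capacity]
```

For example, `knapsack_row_parallel([2, 3, 4], [3, 4, 5], 5)` selects the first two items. The sequential loop over items remains; only the work inside each row is data-parallel in this sketch.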


Author information

Corresponding author

Correspondence to Unmesh D. Bordoloi.

About this article

Cite this article

Bordoloi, U.D., Chakraborty, S. GPU-based Acceleration of System-level Design Tasks. Int J Parallel Prog 38, 225–253 (2010). https://doi.org/10.1007/s10766-009-0125-6

  • DOI: https://doi.org/10.1007/s10766-009-0125-6
