Performance Improvements from Partitioning Applications to FPGA Hardware in Embedded SoCs

The Journal of Supercomputing

Abstract

A hardware/software partitioning methodology is presented for improving performance in single-chip systems composed of a processor and Field Programmable Gate Array (FPGA) reconfigurable logic. Speedups are achieved by executing critical software parts on the reconfigurable logic. The methodology considers a hybrid System-on-Chip platform that can model the majority of existing processor-FPGA systems. The partitioning method uses an automated kernel identification process at the basic-block level to detect critical kernels in applications. Three different instances of the generic platform and two sets of benchmarks are used in the experiments. The analysis of five real-life applications showed that these applications spend, on average, 69% of their instruction count in 11% of their code. The extensive experiments illustrate that, for the systems composed of 32-bit processors, the improvements for the five applications range from 1.3 to 3.7 relative to an all-software solution. For a platform composed of an 8-bit processor, the performance gains for eight DSP algorithms are considerably greater: the average speedup equals 28.
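To make the idea of basic-block-level kernel identification concrete, the following is a minimal sketch, not the authors' actual algorithm. It assumes a profiler has already supplied, for each basic block, its static instruction count and its execution count. The names `BasicBlock`, `identify_kernels`, and `estimated_speedup` are hypothetical, as is the greedy coverage criterion. The speedup estimate is a plain Amdahl-style bound that ignores processor-FPGA communication overhead.

```python
from dataclasses import dataclass


@dataclass
class BasicBlock:
    name: str
    static_instructions: int  # instructions contained in the block
    exec_count: int           # times the block ran during profiling

    @property
    def dynamic_instructions(self) -> int:
        # Contribution of this block to the total instruction count.
        return self.static_instructions * self.exec_count


def identify_kernels(blocks, coverage_target=0.69):
    """Greedily pick the heaviest blocks until the target share of the
    dynamic instruction count is covered (assumed criterion)."""
    total = sum(b.dynamic_instructions for b in blocks)
    ranked = sorted(blocks, key=lambda b: b.dynamic_instructions, reverse=True)
    kernels, covered = [], 0
    for block in ranked:
        if total == 0 or covered / total >= coverage_target:
            break
        kernels.append(block)
        covered += block.dynamic_instructions
    return kernels


def estimated_speedup(blocks, kernels, kernel_speedup):
    """Amdahl-style estimate of whole-application speedup when the chosen
    kernels run kernel_speedup times faster on the FPGA."""
    total = sum(b.dynamic_instructions for b in blocks)
    if total == 0:
        return 1.0
    fraction = sum(b.dynamic_instructions for b in kernels) / total
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


# Made-up profile data for illustration only.
blocks = [
    BasicBlock("a", 10, 1000),
    BasicBlock("b", 50, 10),
    BasicBlock("c", 20, 100),
    BasicBlock("d", 100, 5),
]
kernels = identify_kernels(blocks)
speedup = estimated_speedup(blocks, kernels, kernel_speedup=10.0)
```

In this made-up profile, block `a` alone accounts for 10000 of 13000 dynamic instructions, so it is the only kernel selected. A real partitioner would also have to weigh FPGA area and data-transfer cost, which this sketch leaves out.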



Author information

Corresponding author

Correspondence to Michalis D. Galanis.

About this article

Cite this article

Galanis, M.D., Dimitroulakos, G. & Goutis, C.E. Performance Improvements from Partitioning Applications to FPGA Hardware in Embedded SoCs. J Supercomput 35, 185–199 (2006). https://doi.org/10.1007/s11227-006-2953-0

  • DOI: https://doi.org/10.1007/s11227-006-2953-0
