The Effects of System Hyper Pipelining on Three Computational Benchmarks Using FPGAs

  • Conference paper
Applied Reconfigurable Computing (ARC 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9040)

Abstract

C-Slow Retiming (CSR) generates concentric design clusters and improves the performance-per-area factor of a design by reusing the combinatorial logic in a time-sliced fashion. The limitation of CSR is that all C copies of the design have to be executed continuously. This paper proposes System Hyper Pipelining (SHP), which overcomes the limitations of CSR by adding thread stalling, bypassing, and fork-join queueing techniques. The impact of SHP on multithreading and multiprocessing systems is manifold. This paper concentrates on techniques to improve the performance of individual threads of SHP-based CPUs. SHP is ideal for FPGAs, with their high number of registers and their flexible memory usage. The paper compares standard implementations of CPUs with their CSR and SHP versions. Results based on three state-of-the-art 32-bit processors are shown.
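
The time-sliced reuse of logic described in the abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the paper's implementation: the toy accumulator `step_logic` and the function `run_c_slow` are hypothetical names chosen for illustration. The single state register of a circuit is replaced by a chain of `c` registers, so one copy of the logic serves `c` independent threads in round-robin order.

```python
from collections import deque


def step_logic(state, x):
    """Shared combinational logic: a toy accumulator (hypothetical example)."""
    return state + x


def run_c_slow(inputs_per_thread, c):
    """Simulate a C-slowed accumulator serving `c` independent threads.

    The single state register is modeled as a chain of `c` registers.
    In every cycle the one copy of `step_logic` works on the thread whose
    state is leaving the chain, i.e. thread (cycle mod c).
    """
    assert len(inputs_per_thread) == c
    steps = len(inputs_per_thread[0])
    assert all(len(inputs) == steps for inputs in inputs_per_thread)

    regs = deque([0] * c)  # one register stage per thread, all reset to 0
    for cycle in range(steps * c):
        thread = cycle % c
        x = inputs_per_thread[thread][cycle // c]
        state = regs.popleft()              # state leaving the register chain
        regs.append(step_logic(state, x))   # updated state re-enters the chain
    # After a whole number of rounds the chain is back in thread order.
    return list(regs)


if __name__ == "__main__":
    print(run_c_slow([[1, 2, 3], [10, 20, 30], [100, 200, 300]], c=3))
```

In this model every thread slot advances in every round, whether or not the thread has useful work. That mirrors the CSR limitation the abstract names. The stalling, bypassing, and fork-join queueing that SHP adds are not modeled here.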



Author information

Corresponding author

Correspondence to Tobias Strauch.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Strauch, T. (2015). The Effects of System Hyper Pipelining on Three Computational Benchmarks Using FPGAs. In: Sano, K., Soudris, D., Hübner, M., Diniz, P. (eds) Applied Reconfigurable Computing. ARC 2015. Lecture Notes in Computer Science, vol 9040. Springer, Cham. https://doi.org/10.1007/978-3-319-16214-0_23

  • DOI: https://doi.org/10.1007/978-3-319-16214-0_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16213-3

  • Online ISBN: 978-3-319-16214-0

  • eBook Packages: Computer Science, Computer Science (R0)
