Abstract
Coarse-Grained Reconfigurable Arrays (CGRAs) have become an established approach to providing high computational performance in various environments. Several researchers have found that the achievable performance depends heavily on the interface between memory and the CGRA. In this contribution, we show that a smart prefetching mechanism can increase the performance of the CGRA while consuming fewer hardware resources and less energy than state-of-the-art prefetching mechanisms.
Notes
1. Values are based on our FPGA implementation of the system (work in progress).
2. Also called synthesis in previous publications.
3. Simply setting \(f=p=0\) and increasing \(u\) will result in worse performance, because a high \(u\) decreases performance, as shown in [9].
4. Note that the number of contexts does not directly correlate with the runtime, because some contexts are executed more often, as they are part of inner loops or even different kernels.
References
Archibald, J., Baer, J.L.: Cache coherence protocols: evaluation using a multiprocessor simulation model. ACM Trans. Comput. Syst. 4(4), 273–298 (1986)
Cong, J., Huang, H., Ma, C., Xiao, B., Zhou, P.: A fully pipelined and dynamically composable architecture of CGRA. In: 2014 FCCM, pp. 9–16, May 2014
Dahlgren, F., Stenstrom, P.: Evaluation of hardware-based stride and sequential prefetching in shared-memory multiprocessors. TPDS 7(4), 385–398 (1996)
Fuchs, A., Mannor, S., Weiser, U., Etsion, Y.: Loop-aware memory prefetching using code block working sets. In: 2014 MICRO, pp. 533–544, December 2014
Gatzka, S., Hochberger, C.: The AMIDAR class of reconfigurable processors. J. Supercomput. 32(2), 163–181 (2005)
Gatzka, S., Hochberger, C.: Hardware based online profiling in AMIDAR processors. In: IPDPS, p. 144b (2005)
Hashemi, M., Mutlu, O., Patt, Y.N.: Continuous runahead: transparent hardware acceleration for memory intensive workloads. In: 2016 MICRO, pp. 1–12, October 2016
Ho, C.H., Govindaraju, V., Nowatzki, T., Nagaraju, R., Marzec, Z., Agarwal, P., Frericks, C., Cofell, R., Sankaralingam, K.: Performance evaluation of a DySER FPGA prototype system spanning the compiler, microarchitecture, and hardware implementation. In: 2015 ISPASS, pp. 203–214, March 2015
Jung, L.J., Hochberger, C.: Feasibility of high level compiler optimizations in online synthesis. In: 2015 ReConFig, pp. 1–7, December 2015
Jung, L.J., Hochberger, C.: Optimal processor interface for CGRA-based accelerators implemented on FPGAs. In: 2016 ReConFig, pp. 1–7, November 2016
Lee, H., Nguyen, D., Lee, J.: Optimizing stream program performance on CGRA-based systems. In: Proceedings of the 52nd DAC, DAC 2015, pp. 110:1–110:6. ACM, New York (2015)
Prabhakar, R., Zhang, Y., Koeplinger, D., Feldman, M., Zhao, T., Hadjis, S., Pedram, A., Kozyrakis, C., Olukotun, K.: Plasticine: a reconfigurable architecture for parallel patterns. In: Proceedings of the 44th ISCA, ISCA 2017, pp. 389–402. ACM, New York (2017)
Ruschke, T., Jung, L.J., Wolf, D., Hochberger, C.: Scheduler for inhomogeneous and irregular CGRAs with support for complex control flow. In: 2016 IPDPSW, pp. 198–207, May 2016
Vahid, F., Stitt, G., Lysecky, R.: Warp processing: dynamic translation of binaries to FPGA circuits. Computer 41(7), 40–46 (2008)
Veredas, F.J., Scheppler, M., Moffat, W., Mei, B.: Custom implementation of the coarse-grained reconfigurable ADRES architecture for multimedia purposes. In: FPL 2005, pp. 106–111, August 2005
Yang, C., Liu, L., Yin, S., Wei, S.: Data cache prefetching via context directed pattern matching for coarse-grained reconfigurable arrays. In: 2016 53rd DAC, pp. 1–6, June 2016
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Jung, L.J., Hochberger, C. (2018). Lookahead Memory Prefetching for CGRAs Using Partial Loop Unrolling. In: Voros, N., Huebner, M., Keramidas, G., Goehringer, D., Antonopoulos, C., Diniz, P. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2018. Lecture Notes in Computer Science, vol 10824. Springer, Cham. https://doi.org/10.1007/978-3-319-78890-6_8
DOI: https://doi.org/10.1007/978-3-319-78890-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78889-0
Online ISBN: 978-3-319-78890-6
eBook Packages: Computer Science (R0)