Abstract
A conflict-resolving parallel data memory system for Transport Triggered Architecture (TTA) is described. The architecture is generic and reusable, so it can support various application-specific designs. With parallel memory, multi-port memory, which consumes more area and power, can be replaced with single-port memory modules. The number of ports can also be increased beyond what a design library offers for multi-port memories. In an FFT TTA example, the dual-port data memory was replaced by the proposed architecture. To avoid memory conflicts, the original code was rescheduled and the TTA core was regenerated for the new schedule. Compared with the proposed parallel memory, the original memory required 3.38 times the area and 1.70 times the energy. In this case, the energy consumption of the processor core increased, so the system energy consumption remained about the same. However, the original system required 1.89 times the area.
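The idea of replacing one multi-port memory with several single-port modules can be illustrated with a minimal sketch. The abstract does not state which module assignment function the paper uses, so the low-order interleaving below (`address % num_modules`) and the names `module_of`, `cycles_needed`, and `has_conflict` are assumptions made purely for illustration. The sketch counts how many cycles a group of simultaneous accesses would need if each single-port module serves one access per cycle.

```python
from collections import defaultdict


def module_of(address, num_modules):
    """Hypothetical mapping (an assumption): low-order interleaving."""
    return address % num_modules


def cycles_needed(addresses, num_modules):
    """Cycles to serve one group of parallel accesses.

    Each single-port module serves one access per cycle, so the group
    takes as many cycles as the most heavily loaded module has accesses.
    """
    per_module = defaultdict(int)
    for address in addresses:
        per_module[module_of(address, num_modules)] += 1
    return max(per_module.values(), default=0)


def has_conflict(addresses, num_modules):
    """True if at least two accesses in the group hit the same module."""
    return cycles_needed(addresses, num_modules) > 1


if __name__ == "__main__":
    # Addresses 0 and 1 fall into different modules: served in parallel.
    print(cycles_needed([0, 1], 2))
    # Addresses 0 and 2 both fall into module 0: the accesses serialize.
    print(cycles_needed([0, 2], 2))
```

In this simplified model, two accesses to the same module always count as a conflict, even if they target the same address. Rescheduling the code, as the abstract describes, amounts to arranging accesses so that each simultaneous group needs only one cycle.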
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tanskanen, J.K., Pitkänen, T., Mäkinen, R., Takala, J. (2007). Parallel Memory Architecture for TTA Processor. In: Vassiliadis, S., Bereković, M., Hämäläinen, T.D. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2007. Lecture Notes in Computer Science, vol 4599. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73625-7_29
DOI: https://doi.org/10.1007/978-3-540-73625-7_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73622-6
Online ISBN: 978-3-540-73625-7
eBook Packages: Computer Science, Computer Science (R0)