Abstract
Adding reconfigurable processing units (RPUs) to a general-purpose processor is a promising way to improve performance significantly. In this paper, a Reconfigurable Multi-Core (RMC) architecture combining a general-purpose multi-core processor with reconfigurable logic is proposed. The reconfigurable logic is logically partitioned into RPUs, which are coupled to the general-purpose cores as co-processors via a full crossbar switch. An RPU Manager (RPU-M) is also designed to manage the RPUs. To verify RMC, a simulation method based on Simics and a Virtex-5 FPGA is adopted, which simplifies the simulation and ensures accurate evaluation of the hardware function cores. Five workloads are selected to test RMC: 3-DES, AES, SHA2, IDCT and JPEG_ENC. The experimental results show an average speedup of 3.10 times over a software implementation on the original multi-core, and the data and control communication overhead on RMC is acceptable.
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant No. 61070001, the Special Funds for Key Program of China (No. 2011ZX0302-004-002), the Key Science Foundation of Zhejiang Province under Grant No. 2010C11048, the Research Foundation of Education Bureau of Zhejiang Province under Grant No. Y200909683, the State Key Laboratory of High-end Server & Storage Technology (No. 2009HSSA10), and the National Key Laboratory of Science and Technology on Avionics System Integration.
Cite this article
Yan, L., Wu, B., Wen, Y. et al. A reconfigurable processor architecture combining multi-core and reconfigurable processing units. Telecommun Syst 55, 333–344 (2014). https://doi.org/10.1007/s11235-013-9791-1