Abstract
The utilization wall, caused by the breakdown of threshold voltage scaling, hinders performance gains for new-generation microprocessors. We propose an instruction fusion technique for multiscalar and many-core processors to alleviate its impact. With instruction fusion, similar copies of an instruction to be run on multiple pipelines or cores are merged into a single copy for simultaneous execution. Applied to vector code, instruction fusion lets the processor idle early pipeline stages and instruction caches at various times during program execution with minimal performance degradation, while also reducing program size and the required instruction memory bandwidth. Instruction fusion is applied here to a MIPS-based dual-core processor that resembles an ideal multiscalar of degree two. Benchmarking on an FPGA prototype shows a 6–11% reduction in dynamic power dissipation for the targeted applications and a 17–45% decrease in code size, with frequent performance improvements due to higher instruction cache hit rates.
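The merging idea described in the abstract can be illustrated with a small toy model. The sketch below is not the authors' mechanism: it assumes two cores running equal-length instruction streams in lock step, treats an instruction as a plain string, and fuses only textually identical instructions, whereas the abstract speaks of "similar" copies. The names `fuse_streams` and `code_size_reduction` are hypothetical, introduced only for illustration.

```python
from typing import List, Tuple

# Hypothetical toy model: a fused program entry pairs an instruction string
# with the tuple of core IDs that execute it.
FusedEntry = Tuple[str, Tuple[int, ...]]


def fuse_streams(core0: List[str], core1: List[str]) -> List[FusedEntry]:
    """Merge two lock-step instruction streams into one fused stream.

    Assumption: only textually identical instructions at the same position
    are fused; differing instructions are kept as separate per-core entries.
    """
    if len(core0) != len(core1):
        raise ValueError("this sketch assumes equal-length, lock-step streams")
    fused: List[FusedEntry] = []
    for a, b in zip(core0, core1):
        if a == b:
            # One stored copy, executed on both cores simultaneously.
            fused.append((a, (0, 1)))
        else:
            # No fusion possible: each core keeps its own copy.
            fused.append((a, (0,)))
            fused.append((b, (1,)))
    return fused


def code_size_reduction(core0: List[str], core1: List[str]) -> float:
    """Fraction of stored instructions saved by fusion in this toy model."""
    original = len(core0) + len(core1)
    if original == 0:
        return 0.0
    return 1.0 - len(fuse_streams(core0, core1)) / original
```

In this model, every fused entry is fetched and decoded once instead of twice, which is the intuition behind the reduced instruction memory bandwidth and code size; the figures it produces are illustrative only and say nothing about the measured results reported above.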
Cite this article
Lu, Y., Ziavras, S.G. Instruction Fusion for Multiscalar and Many-Core Processors. Int J Parallel Prog 45, 67–78 (2017). https://doi.org/10.1007/s10766-015-0386-1
DOI: https://doi.org/10.1007/s10766-015-0386-1