Instruction Fusion for Multiscalar and Many-Core Processors

International Journal of Parallel Programming

Abstract

The utilization wall, caused by the breakdown of threshold voltage scaling, hinders performance gains for new-generation microprocessors. We propose an instruction fusion technique for multiscalar and many-core processors to alleviate its impact. With instruction fusion, similar copies of an instruction to be run on multiple pipelines or cores are merged into a single copy for simultaneous execution. Applied to vector code, instruction fusion enables the processor to idle early pipeline stages and instruction caches at various times during program execution with minimal performance degradation, while reducing program size and the required instruction memory bandwidth. Instruction fusion is applied here to a MIPS-based dual-core processor that resembles an ideal multiscalar of degree two. Benchmarking using an FPGA prototype shows a 6–11% reduction in dynamic power dissipation for the targeted applications as well as a 17–45% decrease in code size, with frequent performance improvements due to higher instruction cache hit rates.
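The merging step described in the abstract can be pictured with a small toy model. The sketch below is not the paper's encoding or hardware mechanism. It assumes two lock-step instruction streams of equal length, one per core, treats "similar" as "textually identical", and counts how many instruction slots must be stored after fusion. The names `fuse_streams` and `stored_instructions`, the MIPS-like mnemonics, and the register names are all hypothetical.

```python
from typing import List, Tuple

# An instruction is modelled as a tuple of strings: (opcode, operand, ...).
Instr = Tuple[str, ...]


def fuse_streams(lane0: List[Instr], lane1: List[Instr]) -> List[tuple]:
    """Merge two lock-step instruction streams (toy model, an assumption).

    Positions where both lanes hold the same instruction become one
    ("fused", instr) entry; all other positions keep both instructions
    as a ("split", instr0, instr1) entry.
    """
    if len(lane0) != len(lane1):
        raise ValueError("toy model assumes lock-step lanes of equal length")
    program = []
    for a, b in zip(lane0, lane1):
        if a == b:
            program.append(("fused", a))
        else:
            program.append(("split", a, b))
    return program


def stored_instructions(program: List[tuple]) -> int:
    """Count instruction slots: a fused entry needs one, a split entry two."""
    return sum(1 if entry[0] == "fused" else 2 for entry in program)


# Hypothetical vector-style loop body; each core has its own register file,
# so identical instructions operate on different data.
lane0 = [
    ("lw", "r1", "0(r4)"),
    ("add", "r2", "r1", "r1"),
    ("sw", "r2", "0(r5)"),
    ("addi", "r4", "r4", "4"),
]
lane1 = [
    ("lw", "r1", "0(r4)"),
    ("add", "r2", "r1", "r1"),
    ("sw", "r2", "0(r5)"),
    ("addi", "r4", "r4", "8"),  # differs between lanes, so it is not fused
]

program = fuse_streams(lane0, lane1)
unfused_size = len(lane0) + len(lane1)
fused_size = stored_instructions(program)
```

In this toy example three of the four positions fuse, so fewer slots are stored and fetched than in the unfused pair of streams. The numbers are illustrative only and say nothing about the reductions measured in the paper.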

Author information

Corresponding author

Correspondence to Sotirios G. Ziavras.

About this article

Cite this article

Lu, Y., Ziavras, S.G. Instruction Fusion for Multiscalar and Many-Core Processors. Int J Parallel Prog 45, 67–78 (2017). https://doi.org/10.1007/s10766-015-0386-1

  • DOI: https://doi.org/10.1007/s10766-015-0386-1
