Power Mitigation by Performance Equalization in a Heterogeneous Reconfigurable Multicore Architecture

Journal of Signal Processing Systems

Abstract

This paper presents an integrated self-aware computing model that mitigates the power dissipation of a heterogeneous reconfigurable multicore architecture by dynamically scaling the operating frequency of each core. The power mitigation is achieved by equalizing the performance of all the cores so that data is exchanged between them without interruption. The multicore platform consists of heterogeneous Coarse-Grained Reconfigurable Arrays (CGRAs) of application-specific sizes and a Reduced Instruction-Set Computing (RISC) core. The CGRAs and the RISC core are integrated over a Network-on-Chip (NoC) of six nodes arranged in a topology of two rows and three columns. The RISC core constantly monitors and controls the performance of each CGRA accelerator, adjusting the operating frequencies until the performance of all the CGRAs is optimally balanced over the platform. The CGRA cores process some of the most computationally intensive signal processing algorithms, while the RISC core establishes packet-based synchronization between the cores for computation and communication. All the cores can access each other's computational and memory resources while processing the kernels simultaneously and independently of each other. Besides general-purpose processing and overall platform supervision, the RISC processor manages performance equalization among all the cores, which mitigates the overall dynamic power dissipation by 20.7% in a proof-of-concept test.
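The idea of performance equalization described in the abstract can be illustrated with a minimal sketch. It is not the authors' controller: the core names, cycle counts, and frequencies below are hypothetical, and the sketch computes the balanced frequencies in one step rather than through the run-time monitoring loop that the RISC core performs. It also assumes that dynamic power scales linearly with frequency at a fixed supply voltage and that every core switches the same capacitance.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str          # hypothetical label for a CGRA accelerator
    cycles: int        # clock cycles its kernel needs (hypothetical value)
    f_max_mhz: float   # maximum operating frequency in MHz (hypothetical value)


def equalize(cores):
    """Return per-core frequencies (MHz) so that all kernels finish together.

    The slowest core running at its maximum frequency sets the common
    completion time; every other core is slowed down to match it.
    """
    deadline_us = max(c.cycles / c.f_max_mhz for c in cores)
    return {c.name: c.cycles / deadline_us for c in cores}


def dynamic_power_saving(cores, freqs):
    """Fractional dynamic power saving relative to running all cores at f_max.

    Assumption: P_dyn is proportional to f (fixed voltage, equal switched
    capacitance per core), so the saving reduces to a ratio of frequency sums.
    """
    before = sum(c.f_max_mhz for c in cores)
    after = sum(freqs[c.name] for c in cores)
    return 1.0 - after / before


if __name__ == "__main__":
    cores = [
        Core("cgra_a", 40_000, 200.0),
        Core("cgra_b", 20_000, 200.0),
        Core("cgra_c", 30_000, 200.0),
    ]
    freqs = equalize(cores)
    for name, f in freqs.items():
        print(f"{name}: {f:.1f} MHz")
    print(f"estimated dynamic power saving: {dynamic_power_saving(cores, freqs):.1%}")
```

In this toy setting the core with the longest kernel keeps its full clock, and the faster cores are slowed so that none of them finishes early and idles while waiting for data. The saving it reports is a property of the made-up numbers only and is unrelated to the 20.7% measured in the paper.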



Acknowledgment

This research work was conducted jointly by the Department of Electronics and Communications Engineering, Tampere University of Technology, Finland, and the Department of Computer Science, The University of Chicago, Illinois, USA. It was partially funded by the Academy of Finland under contract #258506 (DEFT: Design of a Highly-parallel Heterogeneous MP-SoC Architecture for Future Wireless Technologies) and by the Tampere Doctoral Programme in Information Science and Engineering, Finland. The Department of Computer Science, The University of Chicago, also provided the financial and on-site resources for its implementation.

Correspondence to Waqar Hussain.

Cite this article

Hussain, W., Hoffmann, H., Ahonen, T. et al. Power Mitigation by Performance Equalization in a Heterogeneous Reconfigurable Multicore Architecture. J Sign Process Syst 87, 287–297 (2017). https://doi.org/10.1007/s11265-016-1142-5

  • DOI: https://doi.org/10.1007/s11265-016-1142-5
