Flexible Data Flow Architecture for Embedded Hardware Accelerators

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11944)

Abstract

To run future algorithms such as advanced control theory, advanced signal processing, data-based modeling, and physical modeling, control units require a substantial step-up in computational power. In the case of an automotive Engine Control Unit (ECU), safety requirements and cost constraints are just as important. Existing solutions for increasing the performance of a microcontroller are either unsuitable for a subset of the expected algorithms or too expensive in terms of area. Hence, we introduce the novel Data Flow Architecture (DFA) for embedded hardware accelerators. The DFA is flexible from the concept level down to the individual functional units, which lets it achieve a high performance-per-size ratio for a wide variety of data-intensive algorithms. Compared to hardwired implementations, the area can be as little as 1.4 times higher at the same performance.

Author information

Corresponding author

Correspondence to Jens Froemmer.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Froemmer, J., Bannow, N., Aue, A., Grimm, C., Schneider, K. (2020). Flexible Data Flow Architecture for Embedded Hardware Accelerators. In: Wen, S., Zomaya, A., Yang, L. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2019. Lecture Notes in Computer Science, vol 11944. Springer, Cham. https://doi.org/10.1007/978-3-030-38991-8_3