
Efficient Operator Sharing Modulo Scheduling for Sum-Product Network Inference on FPGAs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13227))

Abstract

Probabilistic models are receiving increasing attention as a complementary alternative to more widespread machine learning approaches such as neural networks. One particularly interesting class of models, Sum-Product Networks (SPNs), combines the expressiveness of probabilistic models with tractable inference, making them an attractive candidate for use in real-world applications.

Previously, inference in SPNs has successfully been accelerated by fully pipelined FPGA-based hardware. However, with these approaches, the maximum size of the SPN for FPGA acceleration has effectively been limited by the fully spatial mapping of arithmetic operations into hardware and the number of available resources in the FPGA.

In this work, we present an extended and specialized modulo scheduling algorithm based on Integer Linear Programming (ILP) for time-multiplexed sharing of hardware arithmetic operators in the SPN inference accelerator. In addition, to scale the scheduling to large SPN graphs, we combine the scheduling algorithm with a graph-partitioning heuristic that exploits the graph structure of SPNs.
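To make the idea of time-multiplexed operator sharing concrete, the following minimal sketch checks whether a given modulo schedule is valid. It is not the ILP formulation of the paper: the function name, the data layout, and the example latencies are assumptions chosen for illustration, and operators are assumed to be fully pipelined, so each operation occupies its unit for exactly one slot per initiation interval.

```python
from collections import Counter


def is_valid_modulo_schedule(ops, deps, start, ii, units, latency):
    """Check a modulo schedule (hypothetical helper, for illustration only).

    ops:     dict mapping operation name -> operator type
    deps:    list of (producer, consumer) pairs
    start:   dict mapping operation name -> start cycle
    ii:      initiation interval
    units:   dict mapping operator type -> number of hardware units
    latency: dict mapping operator type -> latency in cycles
    """
    # Dependences: a consumer may start only after its producer has finished.
    for src, dst in deps:
        if start[dst] < start[src] + latency[ops[src]]:
            return False
    # Resources: operations of the same type whose start cycles are congruent
    # modulo ii would need the same shared unit in the same slot.
    usage = Counter((ops[o], start[o] % ii) for o in ops)
    return all(count <= units[kind] for (kind, _slot), count in usage.items())


# Tiny example graph: two multiplications feeding one addition.
ops = {"m1": "mul", "m2": "mul", "a1": "add"}
deps = [("m1", "a1"), ("m2", "a1")]
start = {"m1": 0, "m2": 1, "a1": 3}
units = {"mul": 1, "add": 1}       # one shared multiplier, one adder
latency = {"mul": 2, "add": 1}     # assumed latencies

shared_ok = is_valid_modulo_schedule(ops, deps, start, 2, units, latency)
too_tight = is_valid_modulo_schedule(
    ops, deps, {"m1": 0, "m2": 0, "a1": 2}, 1, units, latency
)
```

With an initiation interval of 2, both multiplications can share the single multiplier because they start in different slots. With an initiation interval of 1, they collide, so a second multiplier would be needed.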

The combination of heuristic graph partitioning and ILP-based scheduling allows generating pipelined accelerators with the best possible initiation interval while keeping resource utilization within pre-set bounds. The evaluation discusses the effect of different parameters on convergence time and solution quality. A performance comparison shows that the FPGA improves inference throughput over a comparable CPU and GPU platform by geometric-mean factors of 4.4x and 1.7x, respectively.
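As a rough illustration of how resource bounds limit the achievable initiation interval, the sketch below computes the textbook resource-constrained lower bound on the initiation interval. It assumes an acyclic dataflow graph (so no recurrence bound applies) and fully pipelined operators. The function name and the operator counts are hypothetical, and this is not necessarily how the paper derives its bound.

```python
import math
from collections import Counter


def resource_min_ii(op_types, units):
    """Lower bound on the initiation interval under resource limits.

    op_types: iterable with one operator type per operation in the graph
    units:    dict mapping operator type -> number of available hardware units
    """
    counts = Counter(op_types)
    # Each unit can start one operation per cycle, so a type with n operations
    # and u units needs at least ceil(n / u) cycles per graph iteration.
    return max(math.ceil(n / units[kind]) for kind, n in counts.items())


# Hypothetical graph with 6 multiplications and 3 additions.
graph_ops = ["mul"] * 6 + ["add"] * 3
shared = resource_min_ii(graph_ops, {"mul": 2, "add": 1})
spatial = resource_min_ii(graph_ops, {"mul": 6, "add": 3})
```

In this example, two multipliers and one adder give a lower bound of 3 cycles per sample, whereas a fully spatial mapping with one unit per operation reaches an initiation interval of 1 at a much higher resource cost.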

The authors would like to thank Xilinx Inc. for supporting their work by donations of hardware and software. Calculations for this research were conducted on the Lichtenberg high performance computer of TU Darmstadt. This research was partially funded by the German Federal Ministry for Education and Research (BMBF) with the funding ID ZN 01|S17050.

Notes

  1. Specifically, variables \(\hat{x}_{i j}, \hat{y}_{i j}, \hat{z}_{i v}\) and all constraints mentioning them.

  2. https://github.com/esa-tu-darmstadt/tapasco

Author information

Corresponding author

Correspondence to Hanna Kruppe.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Kruppe, H., Sommer, L., Weber, L., Oppermann, J., Axenie, C., Koch, A. (2022). Efficient Operator Sharing Modulo Scheduling for Sum-Product Network Inference on FPGAs. In: Orailoglu, A., Jung, M., Reichenbach, M. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2021. Lecture Notes in Computer Science, vol 13227. Springer, Cham. https://doi.org/10.1007/978-3-031-04580-6_16

  • DOI: https://doi.org/10.1007/978-3-031-04580-6_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-04579-0

  • Online ISBN: 978-3-031-04580-6

  • eBook Packages: Computer Science, Computer Science (R0)
