A fast MILP solver for high-level synthesis based on heuristic model reduction and enhanced branch and bound algorithm

The Journal of Supercomputing

Abstract

Modeling high-level synthesis (HLS) as mixed integer linear programming (MILP) makes it possible to integrate the constraints and optimization objectives of hardware design into a mathematical intermediate representation. Consequently, previously developed methods for solving MILP models can be improved and customized for domain-specific applications. However, the problem remains NP-hard, and solving large models requires methods that explore the state space intelligently. Despite the high potential of branch and bound (B&B) algorithms to solve MILP models quickly, computing the optimal solution is still an open challenge. In this paper, we first develop three model improvement techniques that reduce the size of the search space for MILP models derived from HLS. Then, we present a new B&B algorithm that tackles the computational challenge by exploiting the properties of the original HLS problem. In this regard, we propose two heuristic techniques that enable the B&B algorithm to prove that some branches are not promising and should be pruned. Moreover, we develop a best-first strategy with a new priority key calculation scheme to improve tree traversal in the B&B algorithm. In addition, the sifting method is employed as the LP relaxation method to process large models quickly. The proposed approach was evaluated on a set of MILP models derived from the synthesis of Mediabench data flow graphs. The experimental results indicate that our approach outperforms modern MILP solvers in terms of speed and the scale of the MILP models it can solve. In the tests on large MILP models, we solved models with 7254 integer and binary variables in less than 13 minutes.
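To make the terms in the abstract concrete, the following is a minimal sketch of a generic best-first B&B, not the algorithm proposed in the paper. It assumes a toy 0/1 knapsack model in place of an HLS-derived MILP, uses the plain LP relaxation bound as the priority key (the paper's own key calculation scheme and pruning heuristics are not reproduced here), and solves the relaxation with the greedy fractional rule rather than the sifting method. The function names `lp_bound` and `best_first_knapsack` are illustrative.

```python
import heapq


def lp_bound(values, weights, capacity, start, value_so_far):
    """Upper bound from the LP relaxation of the remaining knapsack.

    Assumes items are sorted by value/weight ratio (descending), so the
    relaxed optimum fills greedily and takes a fraction of the first
    item that no longer fits.
    """
    bound = float(value_so_far)
    room = capacity
    for i in range(start, len(values)):
        if weights[i] <= room:
            room -= weights[i]
            bound += values[i]
        else:
            bound += values[i] * room / weights[i]
            break
    return bound


def best_first_knapsack(values, weights, capacity):
    """Return the optimal 0/1 knapsack value (weights must be positive)."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]

    best_value = 0  # incumbent: best feasible solution found so far
    # Heap entries: (-bound, depth, value, remaining capacity).
    # heapq is a min-heap, so the negated bound pops the most promising node first.
    heap = [(-lp_bound(v, w, capacity, 0, 0), 0, 0, capacity)]

    while heap:
        neg_bound, depth, value, room = heapq.heappop(heap)
        if -neg_bound <= best_value or depth == len(v):
            continue  # prune: this subtree cannot beat the incumbent

        # Branch 1: fix the item at this depth to 1, if it fits.
        if w[depth] <= room:
            new_value = value + v[depth]
            best_value = max(best_value, new_value)
            bound = lp_bound(v, w, room - w[depth], depth + 1, new_value)
            if bound > best_value:
                heapq.heappush(
                    heap, (-bound, depth + 1, new_value, room - w[depth]))

        # Branch 2: fix the item at this depth to 0.
        bound = lp_bound(v, w, room, depth + 1, value)
        if bound > best_value:
            heapq.heappush(heap, (-bound, depth + 1, value, room))

    return best_value
```

For example, `best_first_knapsack([60, 100, 120], [10, 20, 30], 50)` explores the node with the largest relaxation bound first and discards any node whose bound does not exceed the incumbent. A different priority key or stronger pruning tests would change only the heap key and the pruning conditions.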

Availability of data and materials

The data that support the findings of this study are openly available at http://mathstat.slu.edu/fritts/mediabench.

Funding

This research received no external funding.

Author information

Contributions

Mina Mirhosseini implemented the algorithms as a Ph.D. researcher and wrote the paper in cooperation with Mohammad K. Fallah as the responsible researcher. Mahmood Fazlali supervised the research as the Ph.D. supervisor, and Jeong-A Lee contributed to the paper as an advisor.

Corresponding author

Correspondence to Mahmood Fazlali.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mirhosseini, M., Fazlali, M., Fallah, M.K. et al. A fast MILP solver for high-level synthesis based on heuristic model reduction and enhanced branch and bound algorithm. J Supercomput 79, 12042–12073 (2023). https://doi.org/10.1007/s11227-023-05109-2

  • DOI: https://doi.org/10.1007/s11227-023-05109-2
