Abstract
Modeling high-level synthesis (HLS) as mixed integer linear programming (MILP) makes it possible to integrate the constraints and optimization objectives of hardware design in a mathematical intermediate representation. Consequently, previously developed methods for solving MILP models can be improved and customized for domain-specific applications. However, the problem remains NP-hard, and solving large models requires methods that explore the state space intelligently. Despite the high potential of branch and bound (B&B) algorithms to solve MILP models quickly, computing the optimal solution is still an open challenge. In this paper, we first develop three model improvement techniques that reduce the size of the search space for MILP models derived from HLS. Then, we present a new B&B algorithm that tackles the computational challenge by considering the properties of the original HLS problem. In this regard, we propose two heuristic techniques that enable the B&B algorithm to prove that some branches are not promising and should be pruned. Moreover, we develop a best-first strategy with a new priority key calculation scheme to improve tree traversal in the B&B algorithm. In addition, the sifting method is employed as the LP relaxation method to achieve fast processing of large models. Our proposed approach was evaluated using a set of MILP models derived from the synthesis of Mediabench data flow graphs. The experimental results indicate that our approach outperforms modern MILP solvers in terms of speed and the scale of the MILP models that it can solve. On the large MILP models, we solved models with 7254 integer and binary variables in less than 13 minutes.
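To make the best-first B&B idea concrete, the following is a minimal sketch on a toy 0/1 knapsack problem rather than on an HLS-derived MILP model. It is not the algorithm proposed in the paper: the priority key here is simply the LP relaxation bound of each node, the only pruning rule is comparison against the incumbent, and the relaxation is solved by a greedy fractional fill instead of sifting. The function names `relaxation_bound` and `best_first_knapsack` are hypothetical and chosen for illustration only.

```python
import heapq


def relaxation_bound(values, weights, capacity, start, value, weight):
    """Upper bound from the LP relaxation of the remaining items.

    Items are assumed to be sorted by value density, so filling greedily
    and taking one fractional item gives the relaxation optimum.
    """
    bound, room = value, capacity - weight
    for i in range(start, len(values)):
        if weights[i] <= room:
            room -= weights[i]
            bound += values[i]
        else:
            bound += values[i] * room / weights[i]
            break
    return bound


def best_first_knapsack(values, weights, capacity):
    """Best-first branch and bound; returns (best value, expanded nodes)."""
    # Sort by value density (weights are assumed to be positive).
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]

    best = 0  # incumbent: best feasible objective found so far
    # Heap entries are (-bound, level, value, weight); heapq is a min-heap,
    # so negating the bound pops the most promising node first.
    heap = [(-relaxation_bound(v, w, capacity, 0, 0, 0), 0, 0, 0)]
    expanded = 0

    while heap:
        neg_bound, level, value, weight = heapq.heappop(heap)
        if -neg_bound <= best:
            continue  # prune: this subtree cannot beat the incumbent
        expanded += 1
        if level == len(v):
            continue  # leaf node, nothing left to branch on

        # Branch 1: take item `level` if it fits.
        if weight + w[level] <= capacity:
            new_value, new_weight = value + v[level], weight + w[level]
            best = max(best, new_value)
            bound = relaxation_bound(v, w, capacity, level + 1,
                                     new_value, new_weight)
            if bound > best:
                heapq.heappush(heap, (-bound, level + 1, new_value, new_weight))

        # Branch 2: skip item `level`.
        bound = relaxation_bound(v, w, capacity, level + 1, value, weight)
        if bound > best:
            heapq.heappush(heap, (-bound, level + 1, value, weight))

    return best, expanded


if __name__ == "__main__":
    best, expanded = best_first_knapsack([60, 100, 120], [10, 20, 30], 50)
    print(best, expanded)
```

In this sketch the quality of the priority key decides how early a good incumbent is found, and therefore how many nodes the bound test can discard. That is the part of the search that the paper's own key calculation scheme and pruning heuristics are aimed at.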
Availability of data and materials
The data that support the findings of this study are openly available at http://mathstat.slu.edu/fritts/mediabench.
Funding
This research received no external funding.
Author information
Contributions
Mina Mirhosseini implemented the algorithms as a Ph.D. researcher and wrote the paper in cooperation with Mohammad K. Fallah as the responsible researcher. Mahmood Fazlali supervised the research as the Ph.D. supervisor, and Jeong-A Lee contributed to the paper as an advisor.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mirhosseini, M., Fazlali, M., Fallah, M.K. et al. A fast MILP solver for high-level synthesis based on heuristic model reduction and enhanced branch and bound algorithm. J Supercomput 79, 12042–12073 (2023). https://doi.org/10.1007/s11227-023-05109-2
DOI: https://doi.org/10.1007/s11227-023-05109-2