Parallel branch and bound algorithm for solving integer linear programming models derived from behavioral synthesis
Introduction
Synthesis of digital circuits is the process of generating hardware from technical specifications. It consists of transformations and optimizations at multiple levels that together produce the desired circuit. Behavioral synthesis specifically refers to creating register-transfer level descriptions from algorithmic (or behavioral) descriptions. In behavioral synthesis, optimizing hardware latency under resource constraints is an NP-hard problem [20]. Hence, as the problem size grows, heuristic algorithms such as list-based scheduling [20] are used to solve it. These heuristics, however, do not extend well to the additional goals of multi-objective optimization. An Integer Linear Programming (ILP) formulation of behavioral synthesis allows hardware designers to model various optimization goals and constraints along with resource and timing limitations.
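As an illustration of what such a formulation can look like, the block below sketches the textbook time-indexed ILP for latency minimization under resource constraints. It is not the model developed in this paper, and every symbol is an assumption introduced here: x_{i,l} is 1 if operation i starts at control step l, d_i is the delay of operation i, E is the set of data dependencies, T(i) is the resource type of operation i, a_k is the number of available units of type k, n is a sink operation that follows all others, and \bar{\lambda} is an upper bound on the latency.

```latex
\begin{align}
\min \quad & \sum_{l=1}^{\bar{\lambda}} l \, x_{n,l} \\
\text{s.t.} \quad
& \sum_{l=1}^{\bar{\lambda}} x_{i,l} = 1
  && \forall i \\
& \sum_{l=1}^{\bar{\lambda}} l \, x_{j,l} \;\ge\; \sum_{l=1}^{\bar{\lambda}} l \, x_{i,l} + d_i
  && \forall (i,j) \in E \\
& \sum_{i \,:\, T(i) = k} \;\; \sum_{m = \max(1,\, l - d_i + 1)}^{l} x_{i,m} \;\le\; a_k
  && \forall k,\; \forall l \\
& x_{i,l} \in \{0, 1\}
  && \forall i,\; \forall l
\end{align}
```

The first constraint gives every operation exactly one start step, the second enforces data dependencies, and the third limits the number of operations of each type that execute in any step. Additional objectives or constraints can be added as further linear terms, which is the flexibility the paragraph above refers to.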
Different algorithms and tools have been presented for solving ILP models, but due to the complexity of the problem, solving large models is still a computational challenge. One practical way to cope with the huge state space of ILP models is to develop algorithms adapted to a family of models, exploiting characteristics that the family shares but that do not hold for ILP models in general [3], [13], [27]. Branch and Bound (B&B) is an algorithm design paradigm that allows such special characteristics of a problem to be used to prune the state space.
B&B algorithms address optimization problems by organizing the set of candidate solutions as a tree structure, called the search tree, and systematically enumerating its nodes. Three algorithmic components guide the behavior of a B&B algorithm: the search strategy, the branching strategy, and the pruning rules [21]. The search strategy determines how the search tree is traversed. During tree traversal, a B&B algorithm processes each node and recursively splits the search space into smaller spaces; this splitting is called branching. Instead of processing all tree branches by brute force, a B&B algorithm uses an approximation function to estimate a lower or upper bound for each branch and prunes the branch if it is not promising. A non-promising branch is a sub-space that can be verified not to include the optimal solution. The most common approximation function for B&B algorithms in ILP solvers is LP relaxation [30], where at each node of the search tree the bound is obtained by solving the relaxed ILP.
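The three components can be seen in a minimal sketch. The example below is not the algorithm proposed in this paper: it assumes a 0/1 knapsack problem (a maximization ILP) because its LP relaxation can be solved by a greedy fractional fill, which keeps the code free of an LP solver. The names `lp_bound` and `branch_and_bound` are hypothetical. The search strategy is best-first (a heap ordered by bound), branching fixes the next variable to 1 or 0, and a node is pruned when its bound cannot beat the incumbent.

```python
import heapq


def lp_bound(values, weights, capacity, fixed):
    """Upper bound from the LP relaxation of a knapsack sub-problem.

    `fixed` holds 0/1 decisions for the first len(fixed) items; the rest
    may be fractional. Items must be sorted by value/weight (descending),
    which makes the greedy fill below LP-optimal. Returns None when the
    fixed decisions already exceed the capacity (infeasible node).
    """
    total, room = 0.0, capacity
    for i, x in enumerate(fixed):
        if x:
            total += values[i]
            room -= weights[i]
    if room < 0:
        return None
    for i in range(len(fixed), len(values)):
        if weights[i] <= room:
            total += values[i]
            room -= weights[i]
        else:
            total += values[i] * room / weights[i]  # fractional item
            break
    return total


def branch_and_bound(values, weights, capacity):
    """Best-first B&B; returns (best value, sorted indices of chosen items)."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    best_value, best_fixed = 0, ()
    # Heap entries are (-bound, fixed) so the largest bound is popped first.
    heap = [(-lp_bound(v, w, capacity, ()), ())]
    while heap:
        neg_bound, fixed = heapq.heappop(heap)
        if -neg_bound <= best_value:
            continue  # pruning rule: branch cannot beat the incumbent
        if len(fixed) == len(v):
            # Leaf: all variables are fixed, so the bound is the true value.
            best_value, best_fixed = -neg_bound, fixed
            continue
        for x in (1, 0):  # branching: fix the next variable
            child = fixed + (x,)
            bound = lp_bound(v, w, capacity, child)
            if bound is not None and bound > best_value:
                heapq.heappush(heap, (-bound, child))
    chosen = sorted(order[i] for i, x in enumerate(best_fixed) if x)
    return best_value, chosen
```

For values `[60, 100, 120]`, weights `[10, 20, 30]` and capacity `50`, the call `branch_and_bound(...)` returns the value 220 with items 1 and 2 selected. For a minimization problem such as latency, the same scheme applies with a lower bound and the comparison reversed.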
High performance computing uses multiple processing resources simultaneously, through parallel algorithms, to solve computationally expensive problems in reasonable time. A Parallel B&B (PB&B) algorithm runs multiple processes simultaneously on the search tree. In a PB&B algorithm that uses LP relaxation, the search tree can be processed at different granularities. In sub-node parallelism, a parallel LP relaxation algorithm runs at each node of the search tree. Node parallelism processes different nodes simultaneously. Dividing the search tree into a number of sub-trees and processing them in parallel is called sub-tree parallelism. Tree parallelism is the coarsest granularity: different copies of the search tree are processed in parallel with different strategies, and any process that finds a new bound shares it with the other processes. Sometimes a PB&B algorithm mixes several of these granularities.
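To make one of these granularities concrete, the sketch below illustrates sub-tree parallelism with a shared incumbent. It is an assumption-laden toy, not the paper's implementation: it again uses a 0/1 knapsack problem, Python threads instead of OpenMP, and hypothetical names (`Incumbent`, `explore`, `parallel_solve`). Because of CPython's global interpreter lock the threads do not yield a real speedup here; the point is only the structure, in which each worker searches its own sub-tree and prunes against the best value any worker has found.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import product


class Incumbent:
    """Best integer solution found so far, shared by all workers."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0
        self.choice = ()

    def offer(self, value, choice):
        # Update under a lock so concurrent offers cannot lose a better value.
        with self._lock:
            if value > self.value:
                self.value, self.choice = value, choice


def fractional_bound(values, weights, room, start):
    """LP-relaxation bound for items start.. (items sorted by value/weight)."""
    bound = 0.0
    for i in range(start, len(values)):
        if weights[i] <= room:
            bound += values[i]
            room -= weights[i]
        else:
            bound += values[i] * room / weights[i]
            break
    return bound


def explore(values, weights, capacity, prefix, incumbent):
    """Depth-first B&B inside the sub-tree whose root fixes `prefix`."""
    value = sum(v for v, x in zip(values, prefix) if x)
    room = capacity - sum(w for w, x in zip(weights, prefix) if x)
    if room < 0:
        return  # the sub-tree root is infeasible
    stack = [(len(prefix), value, room, tuple(prefix))]
    while stack:
        depth, value, room, choice = stack.pop()
        bound = value + fractional_bound(values, weights, room, depth)
        # A stale read of incumbent.value only weakens pruning; it is safe.
        if bound <= incumbent.value:
            continue
        if depth == len(values):
            incumbent.offer(value, choice)
            continue
        stack.append((depth + 1, value, room, choice + (0,)))
        if weights[depth] <= room:
            stack.append((depth + 1, value + values[depth],
                          room - weights[depth], choice + (1,)))


def parallel_solve(values, weights, capacity, split_depth=2, workers=4):
    """Split the tree at `split_depth` and search the sub-trees in parallel.

    Assumes items are already sorted by value/weight in descending order.
    """
    incumbent = Incumbent()
    depth = min(split_depth, len(values))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(explore, values, weights, capacity,
                               prefix, incumbent)
                   for prefix in product((0, 1), repeat=depth)]
        for future in futures:
            future.result()  # re-raise any worker exception
    return incumbent.value, incumbent.choice
```

Sharing the incumbent is what lets a good solution found in one sub-tree prune work in the others; without it, each worker would behave like an independent sequential search.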
In this paper we propose two PB&B algorithms for solving ILP models derived from behavioral synthesis by providing sub-node / node parallelism on multi-core platforms. The main contributions of this paper are:
- Representing resource-constrained behavioral synthesis problems (scheduling, resource allocation and binding) in the form of ILP models.
- Developing two fast parallel B&B algorithms with different tree parallelism strategies for solving the ILP models derived from behavioral synthesis.
- Applying memory-efficient techniques for encoding the search tree nodes of the proposed B&B algorithm.
- Using problem metadata to present a best-first tree traversal method for guiding the search.
We used Media Bench [19] DFGs to derive a set of large ILP models that support behavioral synthesis of digital circuits, and we studied their features to optimize the models and develop two efficient parallel algorithms for solving them. We then implemented the proposed algorithms in C++ with the OpenMP library and applied them to the set of models. We also tried to solve the models with the IBM ILOG CPLEX 12.6 [4] optimizer, a state-of-the-art MILP solver that supports parallel processing. The experimental results verify that our B&B approach is scalable for solving ILP models derived from behavioral synthesis and that it outperforms CPLEX: both proposed algorithms can solve models with more than 2000 decision variables and 1000 constraints, while CPLEX cannot solve models with more than 550 variables / constraints.
In the rest of the paper, Section 2 is devoted to related work. In Section 3, we define the problem by building and optimizing the ILP models that support scheduling, resource allocation and resource binding in behavioral synthesis of digital circuits. We then present the proposed PB&B algorithms for solving ILP models in Section 4 and evaluate our experiments in Section 5. Finally, Section 6 concludes the paper by summarizing the main findings of this research.
Section snippets
Related work
High-level optimization of circuit structures is critical in achieving the best circuit implementation [20]. It is one of the main goals in behavioral synthesis and serves a variety of objectives, some examples of which follow. For energy-aware optimization of mobile devices, Li et al. [17] formulated the relation between energy, time and probability based on an energy-minimum model and proposed an efficient dynamic programming algorithm to solve it. In the
Preliminaries
Here we define the problem by developing ILP models for behavioral synthesis. The definitions in this section differ from those in [20], [30] only in minor details.
Definition 1. A Data-Flow Graph (DFG) is a directed graph that represents the behavioral model of a digital circuit at the function level of abstraction in terms of operations (functional units) and their data exchanges (dependencies). Each vertex corresponds to an operation and each edge corresponds to the data
The proposed PB&B algorithms
In this section, we describe the components of the proposed PB&B algorithms as well as parallelization methods and optimizations applied in problem encoding and memory management.
Experimental results
In this study, we use an execution platform with a 64-core AMD processor, 128 GB of main memory, and the CentOS distribution of the Linux operating system to test the effectiveness of the proposed method. The proposed algorithms are implemented in C++ with the OpenMP library. IBM ILOG CPLEX 12.6 has also been used to solve the models, and its results are compared with those of the proposed method. CPLEX is a leading commercial software product for solving MILPs which uses branch and cut (B&C) and supports many
Conclusion
This paper presents two parallel branch and bound algorithms for solving ILP models derived from behavioral synthesis of digital circuits, together with a problem formulation and some optimizations on it. Furthermore, it encodes sub-problems to compress search tree nodes, which reduces memory consumption and increases the scalability of the algorithms in solving larger problems. The proposed algorithms are based on LP-relaxed solving of ILP models and are different in
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (31)
- et al., Replicable parallel branch and bound search, J. Parallel Distrib. Comput. (2018)
- et al., High-level synthesis of on-chip multiprocessor architectures based on answer set programming, J. Parallel Distrib. Comput. (2018)
- Mixed-integer programming model and branch-and-price-and-cut algorithm for urban bus network design and timetabling, Transp. Res. Part B: Methodol. (2018)
- et al., Efficient task scheduling for runtime reconfigurable systems, J. Syst. Archit. (2010)
- et al., Energy-aware scheduling on heterogeneous multi-core systems with guaranteed probability, J. Parallel Distrib. Comput. (2017)
- et al., Branch-and-bound algorithms: a survey of recent advances in searching, branching, and pruning, Discret. Optim. (2016)
- et al., An enhanced MILP-based branch-and-price approach to modularity density maximization on graphs, Comput. Oper. Res. (2019)
- I. CPLEX, Ilog cplex 12.6 optimization studio,...
- et al., Pebbl: an object-oriented framework for scalable parallel branch and bound, Math. Program. Comput. (2015)
- et al., Accelerating datapath merging by task parallelisation on multicore systems, Int. J. Parallel Emerg. Distrib. Syst. (2019)
- A new datapath merging method for reconfigurable system, Proceedings of the International Workshop on Applied Reconfigurable Computing
- A modified merging approach for datapath configuration time reduction, Proceedings of the International Symposium on Applied Reconfigurable Computing
- Efficient datapath merging for the overhead reduction of run-time reconfigurable systems, J. Supercomput.
- Data path configuration time reduction for run-time reconfigurable systems, Proceedings of the ERSA
- High speed merged-datapath design for run-time reconfigurable systems, Proceedings of the International Conference on Field-Programmable Technology
Cited by (5)
- A lightweight semi-centralized strategy for the massive parallelization of branching algorithms, Parallel Computing (2023)
- Rational Jacobi Kernel Functions: A novel massively parallelizable orthogonal kernel for support vector machines, 2024 3rd International Conference on Distributed Computing and High Performance Computing, DCHPC 2024 (2024)
- A fast MILP solver for high-level synthesis based on heuristic model reduction and enhanced branch and bound algorithm, Journal of Supercomputing (2023)