Communication-aware Heterogeneous Multiprocessor Mapping for Real-time Streaming Systems

Lin, Jing; Gerstlauer, Andreas; Evans, Brian L.

doi:10.1007/s11265-012-0674-6

Published: 19 May 2012

Journal of Signal Processing Systems, Volume 69, pages 279–291 (2012)

Jing Lin¹, Andreas Gerstlauer¹ & Brian L. Evans¹


Abstract

Real-time streaming signal processing systems typically require high throughput and low latency. Many such systems can be modeled as synchronous data flow (SDF) graphs. In this paper, we address the problem of multi-objective mapping of SDF graphs onto heterogeneous multiprocessor platforms, accounting for the overhead of bus-based inter-processor communication. The primary contributions are (1) an integer linear programming (ILP) model that globally optimizes throughput, latency and cost; and (2) low-complexity two-stage heuristics that combine an evolutionary algorithm with an ILP to generate either a single sub-optimal mapping solution or a Pareto front for design space optimization. In our simulations, the proposed heuristic achieves up to 12x better run-time efficiency than the global ILP while maintaining a $10^{-6}$ optimality gap in throughput.


Notes

The ratio of weights between throughput and cost is kept approximately the same as in the global ILP.


Author information

Authors and Affiliations

The University of Texas at Austin, Austin, TX, USA
Jing Lin, Andreas Gerstlauer & Brian L. Evans


Corresponding author

Correspondence to Jing Lin.

Additional information

This research was supported by an equipment gift from Intel.

The work in this paper was presented in part at the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing.

Appendix: Scheduling ILP

Given a partition, the execution times of conventional SDF actors and the communication delays of the send and receive actors become fixed parameters. For conciseness, we slightly abuse notation and define the parameters

$$\begin{aligned} D_i &= \sum\limits_j a^*_{ij}D_{ij}, \\ DS_k &= ipc^*_k\cdot\sum\limits_l bs^*_{kl}DC_{kl}, \\ DR_k &= ipc^*_k\cdot ch^*_k\cdot\sum\limits_l br^*_{kl}DC_{kl}, \end{aligned} \tag{23}$$

where $v^*$ denotes a specific value of a variable $v$. Let us define the index set $\mathcal{T}_p=[T_{\rm max}-Period^*,T_{\rm max}]$, in which $Period^*$ is the minimum period determined by the partition. Also denote $\mathcal{K}_s\stackrel{\triangle}{=}\{k|ipc^*_k=1\}$ and $\mathcal{K}_r\stackrel{\triangle}{=}\{k|ipc^*_k\cdot ch^*_k=1\}$, i.e., the sets of edges with an active send or an active receive actor, respectively.
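As an illustration, the fixed parameters of Eq. (23) can be evaluated directly once a partition is known. The sketch below is not taken from the paper: the function name `fixed_delays`, the nested-list data layout, and the reading of `a`, `bs`, `br` and `DC` as 0/1 assignment matrices and per-bus delays are assumptions made for the example.

```python
def fixed_delays(a, D, ipc, ch, bs, br, DC):
    """Evaluate D_i, DS_k and DR_k of Eq. (23) for one fixed partition.

    Assumed layout (hypothetical, for illustration only):
      a[i][j]  : 1 if actor i is mapped to processor j, else 0
      D[i][j]  : execution time of actor i on processor j
      ipc[k]   : 1 if edge k has an active send actor, else 0
      ch[k]    : with ipc[k] = 1, 1 if edge k also has an active receive actor
      bs[k][l] : 1 if the send actor of edge k uses bus l, else 0
      br[k][l] : 1 if the receive actor of edge k uses bus l, else 0
      DC[k][l] : communication delay of edge k on bus l
    """
    n_actors, n_edges = len(a), len(ipc)
    D_i = [sum(x * d for x, d in zip(a[i], D[i])) for i in range(n_actors)]
    DS = [ipc[k] * sum(b * d for b, d in zip(bs[k], DC[k]))
          for k in range(n_edges)]
    DR = [ipc[k] * ch[k] * sum(b * d for b, d in zip(br[k], DC[k]))
          for k in range(n_edges)]
    return D_i, DS, DR


# Two actors on two processors, one edge that crosses processors.
a = [[1, 0], [0, 1]]
D = [[3, 5], [4, 2]]
D_i, DS, DR = fixed_delays(a, D, ipc=[1], ch=[1],
                           bs=[[1, 0]], br=[[0, 1]], DC=[[2, 6]])
print(D_i, DS, DR)  # [3, 2] [2] [6]
```

Setting `ipc[k]` or `ch[k]` to zero makes the corresponding delay vanish, which matches the definition of the sets $\mathcal{K}_s$ and $\mathcal{K}_r$.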

The scheduling sub-problem can then be formalized as an ILP that minimizes

$$\begin{aligned} Latency = {} & \sum\limits_{t\in \mathcal{T}_p}t\,[s_I(t)-s_I(t-1)]+D_I \\ & -\sum\limits_{t\in\mathcal{T}_p}t\,[s_1(t)-s_1(t-1)] \\ & +(s_1(T_{\rm max})-s_I(T_{\rm max}))\,Period^*, \end{aligned} \tag{24}$$
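To make the objective (24) concrete, the following sketch evaluates it for a given schedule. It assumes, as an interpretation not stated explicitly here, that $s_i(t)$ is the cumulative number of firings of actor $i$ started up to time $t$, that time is discretized into integer steps, that $s_i(t)=0$ for $t<0$, and that $\mathcal{T}_p$ is a closed interval. The helper names `at`, `cumulative` and `latency` are hypothetical.

```python
def at(s, t):
    """Cumulative start count s(t); assumed to be zero before time 0."""
    return s[t] if t >= 0 else 0


def cumulative(starts, t_max):
    """Build s(0), ..., s(t_max) from a list of firing start times."""
    return [sum(1 for x in starts if x <= t) for t in range(t_max + 1)]


def latency(s_first, s_last, d_last, period, t_max):
    """Evaluate Eq. (24) for the first actor (1) and the last actor (I)."""
    window = range(t_max - period, t_max + 1)  # index set T_p
    start_last = sum(t * (at(s_last, t) - at(s_last, t - 1)) for t in window)
    start_first = sum(t * (at(s_first, t) - at(s_first, t - 1)) for t in window)
    # Iterations started by actor 1 but not yet started by actor I.
    backlog = at(s_first, t_max) - at(s_last, t_max)
    return start_last + d_last - start_first + backlog * period


s_1 = cumulative([1, 5, 9], t_max=12)
print(latency(s_1, cumulative([3, 7, 11], 12), d_last=2, period=4, t_max=12))  # 4
print(latency(s_1, cumulative([7, 11], 12), d_last=2, period=4, t_max=12))     # 8
```

In the second call the last actor lags one iteration behind the first, so the final term of (24) adds one full period to the latency.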

subject to

Precedence

$$\begin{aligned} s_{src(k)}(t-D_i)P_k-ss_k(t) &\geq 0, \\ ss_k(t-DS_k)-sr_k(t) &\geq 0, \\ sr_k(t-DR_k)-s_{dst(k)}(t)C_k+O_k &\geq 0, \quad \forall k. \end{aligned} \tag{25}$$
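The precedence constraints (25) can be checked for a candidate schedule with a few lines of code. This is a minimal sketch under assumptions: $P_k$, $C_k$ and $O_k$ are read as the production rate, consumption rate and initial token count of edge $k$ (the usual SDF meaning), the start functions are cumulative counts over integer time steps, and the names `at`, `cumulative` and `precedence_ok` are hypothetical.

```python
def at(s, t):
    """Cumulative start count s(t); assumed to be zero before time 0."""
    return s[t] if t >= 0 else 0


def cumulative(starts, t_max):
    """Build s(0), ..., s(t_max) from a list of firing start times."""
    return [sum(1 for x in starts if x <= t) for t in range(t_max + 1)]


def precedence_ok(s_src, ss, sr, s_dst, d_src, ds, dr, p, c, o, t_max):
    """Check the three inequalities of Eq. (25) for one edge at every t."""
    for t in range(t_max + 1):
        # A send may start only for tokens already produced by the source.
        if at(s_src, t - d_src) * p - at(ss, t) < 0:
            return False
        # A receive may start only after the matching send has finished.
        if at(ss, t - ds) - at(sr, t) < 0:
            return False
        # The destination needs c tokens per firing; o tokens are initial.
        if at(sr, t - dr) - at(s_dst, t) * c + o < 0:
            return False
    return True


T = 6
src, snd, rcv = cumulative([0], T), cumulative([2], T), cumulative([3], T)
print(precedence_ok(src, snd, rcv, cumulative([4], T), 2, 1, 1, 1, 1, 0, T))  # True
print(precedence_ok(src, snd, rcv, cumulative([3], T), 2, 1, 1, 1, 1, 0, T))  # False
```

The second call fails because the destination actor starts at $t=3$, before the receive that delivers its token has completed.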

Resource sharing

$$\begin{aligned} & \sum\limits_i [s_i(t)-s_i(t-D_i)] \\ & + \sum\limits_{k\in\mathcal{K}_s}[ss_k(t)-ss_k(t-DS_k)]\,a^*_{src(k)j} \\ & + \sum\limits_{k\in\mathcal{K}_r}[sr_k(t)-sr_k(t-DR_k)]\,a^*_{dst(k)j} \leq 1, \quad \forall j; \end{aligned} \tag{26}$$

$$\begin{aligned} & \sum\limits_{k\in\mathcal{K}_s}[ss_k(t)-ss_k(t-DS_k)]\,bs^*_{kl} \\ & + \sum\limits_{k\in\mathcal{K}_r}[sr_k(t)-sr_k(t-DR_k)]\,br^*_{kl} \leq 1, \quad \forall l. \end{aligned} \tag{27}$$
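The bus constraint (27) can be read as follows: the difference $ss_k(t)-ss_k(t-DS_k)$ counts the send firings of edge $k$ that are still in progress at time $t$, and at most one transfer may occupy a bus at any time. The sketch below checks this for one bus, treating the send and receive terms as a sum. The cumulative-count interpretation, the dictionary layout keyed by edge, and the names `at`, `cumulative` and `bus_ok` are assumptions made for the example.

```python
def at(s, t):
    """Cumulative start count s(t); assumed to be zero before time 0."""
    return s[t] if t >= 0 else 0


def cumulative(starts, t_max):
    """Build s(0), ..., s(t_max) from a list of firing start times."""
    return [sum(1 for x in starts if x <= t) for t in range(t_max + 1)]


def bus_ok(ss, sr, ds, dr, bs, br, bus, t_max):
    """Check Eq. (27) for one bus.

    ss, sr : dict edge -> cumulative start counts, restricted to the edges
             in K_s and K_r respectively
    ds, dr : dict edge -> send / receive delay
    bs, br : dict edge -> list of 0/1 bus assignments
    """
    for t in range(t_max + 1):
        load = sum((at(ss[k], t) - at(ss[k], t - ds[k])) * bs[k][bus]
                   for k in ss)
        load += sum((at(sr[k], t) - at(sr[k], t - dr[k])) * br[k][bus]
                    for k in sr)
        if load > 1:  # more than one transfer in flight on this bus
            return False
    return True


T = 6
ds, bs = {0: 2, 1: 2}, {0: [1], 1: [1]}
overlap = {0: cumulative([1], T), 1: cumulative([2], T)}
serial = {0: cumulative([1], T), 1: cumulative([3], T)}
print(bus_ok(overlap, {}, ds, {}, bs, {}, bus=0, t_max=T))  # False
print(bus_ok(serial, {}, ds, {}, bs, {}, bus=0, t_max=T))   # True
```

In the first call both sends are active at $t=2$; in the second the transfer of edge 1 starts exactly when the transfer of edge 0 ends.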

Periodicity

$$\begin{aligned} &\forall i,\ t\in[T_{\rm max}-D_i, T_{\rm max}], \\ &\quad s_i(t)-s_i(t-Period^*) = N_i; \end{aligned} \tag{28}$$

$$\begin{aligned} &\forall k,\ t\in[T_{\rm max}-DS_k, T_{\rm max}], \\ &\quad ss_k(t)-ss_k(t-Period^*) = N_{src(k)}P_k; \end{aligned} \tag{29}$$

$$\begin{aligned} &\forall k,\ t\in[T_{\rm max}-DR_k, T_{\rm max}], \\ &\quad sr_k(t)-sr_k(t-Period^*) = N_{dst(k)}C_k. \end{aligned} \tag{30}$$
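The periodicity constraints (28) to (30) share one pattern: over the last stretch of the time horizon, the number of firings started within any window of length $Period^*$ must equal a fixed count. A minimal sketch of that check follows, assuming cumulative start counts over integer time steps and closed intervals; the names `at`, `cumulative` and `periodic_ok` are hypothetical.

```python
def at(s, t):
    """Cumulative start count s(t); assumed to be zero before time 0."""
    return s[t] if t >= 0 else 0


def cumulative(starts, t_max):
    """Build s(0), ..., s(t_max) from a list of firing start times."""
    return [sum(1 for x in starts if x <= t) for t in range(t_max + 1)]


def periodic_ok(s, delay, period, firings, t_max):
    """Check s(t) - s(t - period) == firings for t in [t_max - delay, t_max].

    For Eq. (28) pass firings = N_i; for Eqs. (29) and (30) pass
    N_src(k) * P_k and N_dst(k) * C_k respectively.
    """
    return all(at(s, t) - at(s, t - period) == firings
               for t in range(t_max - delay, t_max + 1))


print(periodic_ok(cumulative([1, 5, 9], 12), delay=2, period=4,
                  firings=1, t_max=12))  # True
print(periodic_ok(cumulative([1, 5], 12), delay=2, period=4,
                  firings=1, t_max=12))  # False
```

The second schedule stops firing after $t=5$, so no firing starts in the final period and the check fails.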

About this article

Cite this article

Lin, J., Gerstlauer, A. & Evans, B.L. Communication-aware Heterogeneous Multiprocessor Mapping for Real-time Streaming Systems. J Sign Process Syst 69, 279–291 (2012). https://doi.org/10.1007/s11265-012-0674-6

Received: 31 October 2011
Revised: 11 April 2012
Accepted: 23 April 2012
Published: 19 May 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s11265-012-0674-6
