Abstract
The dynamic nature of state-of-the-art multicore signal processing systems limits the ability of designers to derive accurate models for the targeted applications. Inaccurate assumptions in the model can lead to inefficient implementations and restrict the runtime re-configuration of these systems. On the other hand, dataflow models have provided powerful techniques to analyze and explore the design space for many classes of signal processing systems. In this context, we develop the Partial Expansion Graph (PEG) as an implementation model where existing dataflow graph analysis is augmented with dynamic adaptation, efficient parallelism utilization, and online reconfiguration based on the measured performance of the targeted applications. In this paper, we develop new methods for scheduling and mapping DSP systems using PEGs. Collectively, these methods tune the amount of data parallelism in the application graph and distribute data- and task-parallel instances over different cores while balancing the load across the available processing units. We enable online adaptation for PEG systems using low-overhead customizable solutions. We demonstrate the utility of our PEG-based scheduling and mapping algorithms through experiments on real application models and various synthetic graphs.
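As a rough illustration of what tuning data parallelism and balancing load across cores can mean, the sketch below splits an actor's work into a chosen number of equal data-parallel instances and then maps all instances onto cores with a greedy longest-processing-time-first heuristic. This is not the scheduling or mapping method of the paper: the actor names, the execution times, the `expand` and `map_lpt` helpers, and the assumption that an actor's work divides evenly with no communication or synchronization overhead are all hypothetical, and graph dependencies are ignored.

```python
import heapq


def expand(actors, expansion):
    """Split each actor's work into expansion[name] equal instances.

    Assumption: work divides evenly and expansion adds no overhead.
    Actors missing from `expansion` keep a single instance.
    """
    instances = []
    for name, work in actors.items():
        k = expansion.get(name, 1)
        for i in range(k):
            instances.append((f"{name}#{i}", work / k))
    return instances


def map_lpt(instances, num_cores):
    """Greedy mapping: take instances from longest to shortest and
    place each one on the currently least-loaded core."""
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = {core: [] for core in range(num_cores)}
    for name, cost in sorted(instances, key=lambda item: -item[1]):
        load, core = heapq.heappop(heap)
        assignment[core].append(name)
        heapq.heappush(heap, (load + cost, core))
    loads = {core: load for load, core in heap}
    return assignment, loads


# Hypothetical actors with per-iteration execution times (arbitrary units).
actors = {"fir": 12.0, "fft": 3.0, "demod": 3.0}

# No expansion: the heaviest actor alone dominates one core.
_, loads_plain = map_lpt(expand(actors, {}), num_cores=2)

# Expanding "fir" into two instances lets the load spread more evenly.
_, loads_expanded = map_lpt(expand(actors, {"fir": 2}), num_cores=2)

print(max(loads_plain.values()), max(loads_expanded.values()))
```

In this toy setting the expansion factor is the only tuning knob. An adaptive system of the kind the abstract describes would presumably adjust such factors from measured performance rather than from fixed estimates.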
Acknowledgments
This research was sponsored in part by the Laboratory for Telecommunications Sciences and Texas Instruments.
Cite this article
Zaki, G.F., Plishker, W., Bhattacharyya, S.S. et al. Implementation, Scheduling, and Adaptation of Partial Expansion Graphs on Multicore Platforms. J Sign Process Syst 87, 107–125 (2017). https://doi.org/10.1007/s11265-016-1107-8
DOI: https://doi.org/10.1007/s11265-016-1107-8