Abstract
Efficient, scalable, and productive parallel programming is a major challenge in exploiting future multi-processor SoC platforms. This article presents the MultiFlex programming environment, which has been developed to address this challenge. It targets Platform 2012, a scalable multi-processor fabric. The MultiFlex environment supports high-level simulation and iterative platform mapping, and includes tools for programming-model-aware debug, trace, visualization, and analysis.
This article focuses on the two classes of programming abstractions supported in MultiFlex. The first is a set of Parallel Programming Patterns (PPPs), which offer a rich set of programming abstractions for implementing efficient data- and task-level parallel applications. The second is a Reactive Task Management (RTM) abstraction, which offers a lightweight C-based API to support dynamic dispatching of small-grain tasks on tightly coupled parallel processing resources.
The use of the MultiFlex native programming model is illustrated through the capture and mapping of two representative video applications. The first is a high-quality rescaling (HQR) application on a multi-processor platform. We present the details of the optimization process required to map the HQR application, whose reference code requires 350 GIPS (giga-instructions per second), onto a 16-processor cluster. Our results show that the parallel implementation using the PPP model achieves almost linear speedup with respect to the number of processing elements.
The second application is a high-definition VC-1 decoder. For this application, we illustrate two different parallel programming model variants, one using PPPs, the other based on RTM. These two versions are mapped onto two variants of a homogeneous version of the Platform 2012 multi-core fabric.
Index Terms
- Parallel programming patterns for multi-processor SoC: Application to video processing