
Dynamic task partition for video decoding on heterogeneous dual-core platforms

Published: 29 March 2013

Abstract

This article presents the design of a video decoder that uses a dynamic task partition approach on a heterogeneous dual-core embedded platform. For such systems, application development typically relies on a static partition of tasks between the two cores, fixed at design time. In this article, we propose a runtime dynamic task partition model and use it to implement an MPEG-4 Simple Profile video decoder on a TI OMAP 5912 platform. Compared with a traditional mobile video decoder optimized for the same DSP core, dynamic task partition yields an average performance gain of 38.4%. More importantly, the gain is achieved under the design constraint that the implementation effort for the dynamic task partition decoder is about the same as that for a decoder built with the design-time task partition model. Contrary to the common belief that inter-processor communication overhead is too high to justify intense cooperation between two heterogeneous cores, this article shows that adopting a dynamic task partition model is indeed beneficial on commercially available heterogeneous multi-core platforms.
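The contrast between design-time and runtime partition can be illustrated with a toy model. The sketch below is not the scheduler described in the article: it assumes independent tasks, made-up per-core costs, a fixed per-task inter-processor communication cost (`ipc_overhead`), and a simple greedy rule that hands each task to whichever core would finish it first. All names in it are hypothetical.

```python
from typing import List, Tuple

# Hypothetical per-task costs in arbitrary time units:
# (cost on the RISC core, cost on the DSP core).
Task = Tuple[int, int]
RISC, DSP = 0, 1


def static_makespan(tasks: List[Task], assignment: List[int]) -> int:
    """Finish time when every task runs on a core chosen at design time."""
    busy = [0, 0]
    for task, core in zip(tasks, assignment):
        busy[core] += task[core]
    return max(busy)


def dynamic_makespan(tasks: List[Task], ipc_overhead: int = 0) -> int:
    """Finish time when each task goes, at run time, to the core that
    would complete it earliest. Every dispatch pays ipc_overhead, an
    assumed stand-in for inter-processor communication cost."""
    free_at = [0, 0]
    for task in tasks:
        finish = [free_at[core] + task[core] + ipc_overhead
                  for core in (RISC, DSP)]
        core = RISC if finish[RISC] <= finish[DSP] else DSP
        free_at[core] = finish[core]
    return max(free_at)


# Eight identical tasks that the DSP runs faster than the RISC core.
TASKS: List[Task] = [(5, 3)] * 8
ALL_ON_DSP = [DSP] * len(TASKS)

print(static_makespan(TASKS, ALL_ON_DSP))
print(dynamic_makespan(TASKS, ipc_overhead=1))
```

With these invented costs, the all-on-DSP static schedule takes 24 units and the dynamic schedule takes 20 despite paying one unit of overhead per task. Raising `ipc_overhead` to 10 pushes the dynamic schedule to 60 units, which is the trade-off the abstract refers to. The numbers are illustrative only and are unrelated to the measurements reported in the article.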


Cited By

  • (2016) Dynamic core allocation for energy efficient video decoding in homogeneous and heterogeneous multicore architectures. Future Generation Computer Systems 56:C (247-261). DOI: 10.1016/j.future.2015.09.018. Online publication date: 1-Mar-2016
  • (2015) An Efficient Application Processor Architecture for Multicore Software Video Decoding. IEEE Transactions on Circuits and Systems for Video Technology 25:2 (325-338). DOI: 10.1109/TCSVT.2014.2329365. Online publication date: 1-Feb-2015

Published In

ACM Transactions on Embedded Computing Systems, Volume 12, Issue 1s
Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
March 2013
701 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/2435227
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 March 2013
Accepted: 01 January 2011
Revised: 01 November 2010
Received: 01 July 2010
Published in TECS Volume 12, Issue 1s

Author Tags

  1. Video codec
  2. embedded multimedia systems
  3. heterogeneous multicore
  4. task partition

Qualifiers

  • Research-article
  • Research
  • Refereed
