Evaluation of data-parallel H.264 decoding approaches for strongly resource-restricted architectures

Multimedia Tools and Applications

Abstract

Decoding an H.264 video stream is a computationally demanding multimedia application that poses serious challenges to current processor architectures. For processors with strongly limited computational resources, a natural way to tackle this problem is to use multi-core systems. This paper contributes a systematic overview and performance evaluation of parallel video decoding approaches. We focus on decoder splittings for the strongly resource-restricted environments typical of mobile devices. For the evaluation, we introduce a high-level methodology that estimates the runtime behaviour of multi-core decoding architectures. We use this methodology to investigate six methods for data-parallel splitting of an H.264 decoder. These methods are compared in terms of runtime complexity, core usage, inter-communication and bus transfers. We present benchmark results for different numbers of processor cores. Our results are intended to help identify the splitting strategy best suited to the targeted hardware architecture.
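To make the compared quantities concrete, the following is a minimal sketch of how runtime and core usage could be estimated for one simple data-parallel split. It is not the methodology or any of the six splitting methods evaluated in the paper. It assumes a frame is divided into contiguous slices of macroblock rows, one slice per core, and that cores run independently; inter-core dependencies and bus transfers are deliberately ignored. The function names and the per-row cycle counts are hypothetical.

```python
def split_rows(num_rows, num_cores):
    """Assign macroblock rows to cores in contiguous, near-equal slices."""
    base, extra = divmod(num_rows, num_cores)
    slices, start = [], 0
    for core in range(num_cores):
        # The first `extra` cores take one additional row each.
        size = base + (1 if core < extra else 0)
        slices.append(range(start, start + size))
        start += size
    return slices


def estimate(row_costs, num_cores):
    """Return (runtime, core_usage, speedup) for a contiguous row split.

    Assumption: row_costs holds positive per-row decoding costs
    (e.g. cycles), and cores do not wait on each other.
    """
    slices = split_rows(len(row_costs), num_cores)
    loads = [sum(row_costs[r] for r in s) for s in slices]
    runtime = max(loads)                      # slowest core bounds the frame
    total = sum(loads)
    core_usage = total / (num_cores * runtime)  # fraction of busy core time
    speedup = total / runtime                   # relative to a single core
    return runtime, core_usage, speedup


# Hypothetical per-row costs for a six-row frame.
row_costs = [10, 12, 30, 28, 11, 9]
for n in (1, 2, 3):
    runtime, usage, speedup = estimate(row_costs, n)
    print(n, runtime, round(usage, 3), round(speedup, 3))
```

With these made-up costs, two cores are nearly balanced, while three cores leave the expensive middle rows on a single core, so core usage drops even though the core count rises. This illustrates why the choice of splitting strategy matters for load balance.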

Acknowledgements

This work has been supported by the Austrian Federal Ministry of Transport, Innovation, and Technology under the FIT-IT project VENDOR (Project nr. 812429). Michael Bleyer would like to acknowledge the Austrian Science Fund (FWF) for financial support under project P19797.

Author information

Corresponding author

Correspondence to Florian H. Seitner.

About this article

Cite this article

Seitner, F.H., Bleyer, M., Gelautz, M. et al. Evaluation of data-parallel H.264 decoding approaches for strongly resource-restricted architectures. Multimed Tools Appl 53, 431–457 (2011). https://doi.org/10.1007/s11042-010-0501-7

  • DOI: https://doi.org/10.1007/s11042-010-0501-7
