Abstract
Decoding an H.264 video stream is a computationally demanding multimedia application that poses serious challenges to current processor architectures. For processors with severely limited computational resources, a natural way to tackle this problem is to use multi-core systems. The contribution of this paper is a systematic overview and performance evaluation of parallel video decoding approaches. We focus on decoder splittings for the strongly resource-restricted environments typical of mobile devices. For the evaluation, we introduce a high-level methodology that estimates the runtime behaviour of multi-core decoding architectures. We use this methodology to investigate six methods for data-parallel splitting of an H.264 decoder. These methods are compared in terms of runtime complexity, core usage, inter-communication and bus transfers. We present benchmark results for different numbers of processor cores. Our results should help in finding the splitting strategy best suited to the targeted hardware architecture.
Acknowledgements
This work has been supported by the Austrian Federal Ministry of Transport, Innovation, and Technology under the FIT-IT project VENDOR (Project nr. 812429). Michael Bleyer would like to acknowledge the Austrian Science Fund (FWF) for financial support under project P19797.
Cite this article
Seitner, F.H., Bleyer, M., Gelautz, M. et al. Evaluation of data-parallel H.264 decoding approaches for strongly resource-restricted architectures. Multimed Tools Appl 53, 431–457 (2011). https://doi.org/10.1007/s11042-010-0501-7
DOI: https://doi.org/10.1007/s11042-010-0501-7