Abstract
The added encoding efficiency and visual quality offered by the High Efficiency Video Coding (HEVC) standard are attained at the cost of significant computational complexity in both the encoder and the decoder. In particular, the considerable number of intra prediction modes now considered by this standard, together with the increased complexity of the adopted block coding tree structures, which use a larger diversity of transforms, imposes computational demands that current general-purpose processors can hardly meet under hard real-time requirements. Furthermore, the strict data dependencies that are imposed make parallelization difficult and hardly efficient with conventional approaches. To overcome this difficulty, this paper exploits Graphics Processing Units (GPUs) to accelerate the intra decoding procedure in HEVC, encompassing the most demanding modules of the decoder (i.e., de-quantization, inverse transform, intra prediction, deblocking filter, and sample adaptive offset). The presented approaches comprehensively exploit both coarse-grained and fine-grained parallelization opportunities in an integrated perspective by re-designing the execution pattern of the involved modules, while simultaneously coping with their inherent computational complexity and strict data dependencies. As a result, the proposed parallelization, which is fully compliant with the HEVC standard, has proven to be a remarkably viable approach, capable of satisfying hard real-time requirements by processing each Ultra HD 4K intra frame in less than 25 ms (about 40 fps).
Notes
The presented GPU-based HEVC intra decoder is available upon request by e-mail to the corresponding authors.
Additional information
This work was supported by national funds through FCT—Fundação para a Ciência e a Tecnologia, under projects SFRH/BD/76285/2011, PTDC/EEI-ELC/3152/2012, and UID/CEC/50021/2013.
About this article
Cite this article
de Souza, D.F., Ilic, A., Roma, N. et al. GPU-assisted HEVC intra decoder. J Real-Time Image Proc 12, 531–547 (2016). https://doi.org/10.1007/s11554-015-0519-1
DOI: https://doi.org/10.1007/s11554-015-0519-1