Real-time UHD video super-resolution and transcoding on heterogeneous hardware

Dong, Yu; Song, Li; Xie, Rong; Zhang, Wenjun

doi:10.1007/s11554-019-00913-7

Special Issue Paper
Published: 20 September 2019

Volume 17, pages 2029–2045, (2020)
Journal of Real-Time Image Processing

Yu Dong¹,
Li Song¹,²,
Rong Xie¹ &
…
Wenjun Zhang¹

Abstract

Video has become the dominant type of data produced and consumed every day. As screens grow larger, ultra-high-definition (UHD) video is becoming more popular because it provides a better visual experience. However, UHD content is still scarce. High-performance video super-resolution (SR) techniques, which obtain high-resolution (HR) video from low-resolution (LR) sources, have recently been adopted in UHD video production. Deep learning (DL)-based SR methods can produce HR video of appreciable objective and subjective quality, but their massive computational complexity makes processing far slower than real time, even on GPU servers, when producing UHD video. Moreover, transcoding and the other video processing algorithms executed during enhancement are also time and resource consuming, and they run relatively slowly on ordinary CPU and GPU servers. Hardware such as GPUs, field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs) has proved to have outstanding capability in different aspects of image and video processing, and there are also dedicated hardware accelerators for specific video processing tasks. In this paper, we focus on accelerating a UHD video enhancement workflow on a heterogeneous system with multiple hardware accelerators. First, we optimize the most time-consuming task, video SR, with the cuDNN and CUDA libraries to achieve real-time processing speed for a single UHD output frame on an ordinary GPU. Second, we design a GPU-friendly multi-thread scheduling algorithm for data and computation to better utilize GPU resources and achieve real-time performance when outputting UHD video clips. Third, targeting production environments, we build a UHD video enhancement application on selected heterogeneous hardware, with an integrated command-line tool for our proposed algorithm, and achieve a real-time end-to-end processing speed of 60 fps. Experiments show the high efficiency, robustness and compatibility of our approach.
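
The abstract describes a workflow in which decoding, SR and encoding run on different hardware and are overlapped by a multi-thread scheduler. The authors' actual scheduling algorithm is not given on this page, so the following is only a minimal sketch of the general idea, assuming a plain three-stage producer/consumer pipeline with bounded queues. Every name in it (decode_stage, sr_stage, encode_stage, upscale_2x, run_pipeline, frame_budget_ms) is hypothetical, the "decoder" emits synthetic frames, and the "SR" step is nearest-neighbour upscaling standing in for a CNN running on a GPU.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker passed down the pipeline


def decode_stage(num_frames, width, height, out_q):
    """Stand-in decoder: emits small synthetic LR frames as nested lists."""
    for i in range(num_frames):
        frame = [[(i + x + y) % 256 for x in range(width)] for y in range(height)]
        out_q.put((i, frame))  # blocks when the bounded queue is full
    out_q.put(SENTINEL)


def upscale_2x(frame):
    """Placeholder for SR: nearest-neighbour 2x upscaling, not a neural network."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(2)]  # duplicate each pixel horizontally
        out.append(wide)
        out.append(list(wide))  # duplicate each row vertically
    return out


def sr_stage(in_q, out_q):
    """Consumes LR frames and forwards upscaled frames until end of stream."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)
            break
        idx, frame = item
        out_q.put((idx, upscale_2x(frame)))


def encode_stage(in_q, results):
    """Stand-in encoder: records (frame index, output width, output height)."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        idx, frame = item
        results.append((idx, len(frame[0]), len(frame)))


def run_pipeline(num_frames=8, width=4, height=3, depth=2):
    """Runs the three stages concurrently; `depth` bounds frames in flight per queue."""
    q1 = queue.Queue(maxsize=depth)
    q2 = queue.Queue(maxsize=depth)
    results = []
    threads = [
        threading.Thread(target=decode_stage, args=(num_frames, width, height, q1)),
        threading.Thread(target=sr_stage, args=(q1, q2)),
        threading.Thread(target=encode_stage, args=(q2, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps
```

Because each stage is a single thread reading from a FIFO queue, frames leave the pipeline in the order they entered. The bounded queues provide back-pressure so that a fast stage cannot run arbitrarily far ahead of a slow one. At the 60 fps target stated in the abstract, frame_budget_ms(60) gives roughly 16.7 ms per frame, so in a pipeline of this shape the slowest stage has to fit within that budget for the whole chain to keep up.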

Acknowledgements

This work was supported by NSFC (61521062, U1611461, 61671296), MoE-China Mobile Research Fund Project (MCM20180702), the 111 Project (B07022 and Sheitc No. 150633) and the Shanghai Key Laboratory of Digital Media Processing and Transmissions.

Author information

Authors and Affiliations

Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China
Yu Dong, Li Song, Rong Xie & Wenjun Zhang
MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China
Li Song

Corresponding author

Correspondence to Li Song.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dong, Y., Song, L., Xie, R. et al. Real-time UHD video super-resolution and transcoding on heterogeneous hardware. J Real-Time Image Proc 17, 2029–2045 (2020). https://doi.org/10.1007/s11554-019-00913-7

Received: 30 April 2019
Accepted: 06 September 2019
Published: 20 September 2019
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11554-019-00913-7
