Parallel implementations of frame rate up-conversion algorithm using OpenCL on heterogeneous computing devices

Multimedia Tools and Applications

Abstract

As a video post-processing technique, frame rate up-conversion (FRUC) raises the frame rate of a video by inserting intermediate frames between adjacent original frames. Because its computational cost grows rapidly as video resolution and frame rate increase, parallel computing is a natural way to accelerate FRUC. This paper proposes an effective parallel FRUC algorithm with two main parts: a parallel motion estimation algorithm based on three-dimensional recursive search (3DRS) and a parallel motion compensation algorithm. The motion estimation module exploits parallelism at two granularities, the macro-block level and the candidate-motion-vector level, while the motion compensation module uses pixel-level parallelism. The proposed parallel FRUC algorithm was tested on several hardware platforms. The results show speedups of up to 96× for 1920 × 1080 video and 254× for 3840 × 2160 video over a sequential CPU implementation. Moreover, the OpenCL implementation of the parallel FRUC algorithm ports well across various GPU platforms.
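
To make the three levels of parallelism named in the abstract concrete, the following is a minimal sequential sketch in Python, not the authors' OpenCL code. All function names, the fixed candidate list, the block size, and the border clamping are assumptions made for illustration. In a real 3DRS estimator the candidates come from spatially and temporally neighboring blocks, which this sketch does not model. Each call to `sad` is independent of the others (candidate-level parallelism), each block in `estimate_motion` is independent given its candidates (macro-block-level parallelism), and each output pixel in `interpolate` is independent (pixel-level parallelism).

```python
def clamp(v, lo, hi):
    """Restrict v to the closed range [lo, hi]."""
    return max(lo, min(v, hi))


def pixel(frame, x, y):
    """Read a pixel, clamping coordinates to the frame border (an assumption)."""
    h, w = len(frame), len(frame[0])
    return frame[clamp(y, 0, h - 1)][clamp(x, 0, w - 1)]


def sad(prev, curr, bx, by, mv, block):
    """Sum of absolute differences for one candidate motion vector.

    mv = (dx, dy) is the motion from prev to curr, so the block at
    (bx, by) in curr is compared with prev displaced by -mv.
    """
    dx, dy = mv
    return sum(
        abs(pixel(curr, x, y) - pixel(prev, x - dx, y - dy))
        for y in range(by, by + block)
        for x in range(bx, bx + block)
    )


def estimate_motion(prev, curr, candidates, block):
    """Pick, for every block, the candidate vector with the lowest SAD.

    The candidate list is fixed here for simplicity (hypothetical choice).
    """
    h, w = len(curr), len(curr[0])
    return [
        [
            min(candidates, key=lambda mv: sad(prev, curr, bx, by, mv, block))
            for bx in range(0, w, block)
        ]
        for by in range(0, h, block)
    ]


def interpolate(prev, curr, mv_field, block):
    """Build the midpoint frame by averaging along each block's vector.

    Assumes even vector components so that mv / 2 is an integer offset.
    """
    h, w = len(curr), len(curr[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = mv_field[y // block][x // block]
            a = pixel(prev, x - dx // 2, y - dy // 2)
            b = pixel(curr, x + dx // 2, y + dy // 2)
            out[y][x] = (a + b) // 2
    return out
```

In an OpenCL port of a sketch like this, one would plausibly map each block or candidate to a work-item in the estimation kernel and each output pixel to a work-item in the compensation kernel; the exact mapping used in the paper is not stated in the abstract.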

Acknowledgements

This work is funded in part by the National Natural Science Foundation of China (No. 61303032), the Fundamental Research Funds for the Central Universities (JB160209), and the National Basic Research Program (973 Program) of China (No. 2013CB329402).

Author information

Corresponding author

Correspondence to Huming Zhu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhu, H., Wang, D., Zhang, P. et al. Parallel implementations of frame rate up-conversion algorithm using OpenCL on heterogeneous computing devices. Multimed Tools Appl 78, 9311–9334 (2019). https://doi.org/10.1007/s11042-018-6532-1

  • DOI: https://doi.org/10.1007/s11042-018-6532-1
