Human face super-resolution on poor quality surveillance video footage

  • Original Article
Neural Computing and Applications

Abstract

Most super-resolution (SR) methods proposed to date do not use real ground-truth pairs of high-resolution (HR) and low-resolution (LR) images; instead, the vast majority use synthetic LR images generated from the HR images. Such methods perform very well on synthetic datasets, but their performance degrades on real-world poor-quality surveillance video footage. A promising alternative is to apply recent advances in style transfer for unpaired datasets, but state-of-the-art work along these lines has used LR images and HR images from completely different datasets, introducing more variation between the HR and LR domains than necessary. In this paper, we propose methods that overcome both of these shortcomings, applying unpaired style transfer learning methods to face SR but using HR and LR datasets that share important properties. The key is to acquire roughly paired training data from a high-quality main stream and a lower-quality sub-stream of the same IP camera. Based on this principle, we have constructed four datasets comprising more than 400 people, with 1–15 weakly aligned real HR–LR pairs for each subject. We adopt a cycle generative adversarial network (Cycle GAN) approach that produces impressive super-resolved images for low-quality test images never seen during training. Experiments demonstrate the efficacy of the method. The approach to face SR advocated in this paper makes possible many real-world applications that require extracting high-quality face images from low-resolution video streams such as those produced by security cameras. Developers of diverse applications such as face recognition, 3D face reconstruction, face alignment, face parsing, human–computer interaction, remote sensing, and access control will benefit from the methods introduced in this work.
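
The cycle-consistency idea underlying Cycle GANs can be illustrated with a minimal sketch. This is not the authors' implementation: the two "generators" below are hypothetical, non-learned stand-ins (nearest-neighbour upsampling and block averaging), the function names and the scale factor of 4 are assumptions made for illustration, and the adversarial terms of the full objective are omitted. The sketch only shows how an L1 cycle loss is computed over unpaired LR and HR sets: each LR image is mapped to HR and back, each HR image is mapped to LR and back, and the reconstruction errors are averaged.

```python
import numpy as np


def upsample_nn(lr, scale=4):
    """Hypothetical stand-in for a learned LR->HR generator (nearest neighbour)."""
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)


def downsample_avg(hr, scale=4):
    """Hypothetical stand-in for a learned HR->LR generator (block averaging)."""
    h, w = hr.shape[0] // scale, hr.shape[1] // scale
    blocks = hr[: h * scale, : w * scale].reshape(h, scale, w, scale)
    return blocks.mean(axis=(1, 3))


def cycle_consistency_loss(lr_images, hr_images, g_lr2hr, g_hr2lr):
    """L1 cycle loss over two unpaired sets of 2-D grayscale images.

    LR cycle: x -> g_lr2hr(x) -> g_hr2lr(...) should reproduce x.
    HR cycle: y -> g_hr2lr(y) -> g_lr2hr(...) should reproduce y.
    The two sets need not have the same size or contain matching subjects.
    """
    lr_term = np.mean([np.abs(g_hr2lr(g_lr2hr(x)) - x).mean() for x in lr_images])
    hr_term = np.mean([np.abs(g_lr2hr(g_hr2lr(y)) - y).mean() for y in hr_images])
    return float(lr_term + hr_term)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lr_set = [rng.random((16, 16)) for _ in range(3)]   # toy LR "faces"
    hr_set = [rng.random((64, 64)) for _ in range(2)]   # toy, unrelated HR "faces"
    print(cycle_consistency_loss(lr_set, hr_set, upsample_nn, downsample_avg))
```

In an actual Cycle GAN, both mappings would be neural networks trained jointly, with this term added to adversarial losses from one discriminator per domain.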

Notes

  1. The dataset will be released publicly on acceptance of this paper.

  2. https://github.com/richzhang/PerceptualSimilarity.

  3. https://github.com/mseitzer/pytorch-fid.

Acknowledgements

This work was supported by a Thailand National Science and Technology Development Agency grant to MND and ME as well as graduate fellowships from the University of the Punjab, Lahore, Pakistan, and the Asian Institute of Technology (AIT), Thailand, to MF.

Author information

Corresponding author

Correspondence to Muhammad Farooq.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Farooq, M., Dailey, M.N., Mahmood, A. et al. Human face super-resolution on poor quality surveillance video footage. Neural Comput & Applic 33, 13505–13523 (2021). https://doi.org/10.1007/s00521-021-05973-0

  • DOI: https://doi.org/10.1007/s00521-021-05973-0
