Continuous digital zoom with cross attention for dual camera system

Yang, Yifan; Li, Qi; Xu, Zhihai; Feng, Huajun; Chen, Yueting

doi:10.1007/s11042-021-11688-0

Published: 10 November 2021

Volume 81, pages 2959–2977, (2022)

Multimedia Tools and Applications

Yifan Yang¹,
Qi Li ORCID: orcid.org/0000-0002-1672-6362¹,
Zhihai Xu¹,
Huajun Feng¹ &
…
Yueting Chen¹

Abstract

Reference-based super-resolution (RefSR) aims to recover realistic textures when a reference (Ref) image and a low-resolution (LR) image are given. Because the Ref images are selected randomly, the quality of RefSR degrades when the Ref image shares little content with the LR input. In this article, we propose a dual camera system to unleash the potential of RefSR. Additionally, we present a cross attention mechanism that realizes high-quality digital zoom by using two camera modules with different focal lengths. In the dual camera system, the module with the shorter focal length produces a wide-view image with low resolution, while the module with the longer focal length produces a tele-view image via optical zoom. The long-focal image contains more detail than the short-focal image and can guide the reconstruction of the short-focal image's high-frequency content. Since the two images are taken of the same scene, the dual camera system yields better image matching correlation. Inspired by recent work on RefSR, we propose a cross attention mechanism that fuses two images with different focal lengths and generates more feature correlations between them through texture transfer. In addition, we use segmentation information to improve matching accuracy. Instead of directly matching different images, the attention module fully utilizes textures at different levels. We also present a feature restoration module to reconstruct more image details. Extensive experiments show that our method achieves state-of-the-art results both quantitatively and qualitatively across different datasets.
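
To make the texture-transfer idea concrete, the following is a minimal sketch of generic scaled dot-product cross attention, in which features from the wide-view (short-focal) image act as queries and features from the tele-view (long-focal) image act as keys and values. It is not the network described in the article: the function names, the use of raw patch features without learned projections, and the toy inputs are all assumptions made for illustration, and the segmentation guidance and multi-level textures mentioned above are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(wide_feats, tele_feats):
    """Transfer tele-view features onto wide-view positions.

    wide_feats: list of feature vectors (queries), one per wide-view patch.
    tele_feats: list of feature vectors used as both keys and values
                (a simplification; a real model would learn projections).
    Returns (transferred, weights): one transferred vector and one
    attention distribution per wide-view patch.
    """
    dim = len(wide_feats[0])
    scale = 1.0 / math.sqrt(dim)
    transferred, weights = [], []
    for query in wide_feats:
        # Similarity of this wide-view patch to every tele-view patch.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in tele_feats]
        attn = softmax(scores)
        # Weighted sum of tele-view features: the "transferred texture".
        mixed = [sum(w * value[d] for w, value in zip(attn, tele_feats))
                 for d in range(dim)]
        transferred.append(mixed)
        weights.append(attn)
    return transferred, weights


# Hypothetical toy features: two wide-view patches, three tele-view patches.
wide = [[1.0, 0.0], [0.0, 1.0]]
tele = [[4.0, 0.0], [0.0, 4.0], [1.0, 1.0]]
transferred, weights = cross_attention(wide, tele)
```

In this toy case each wide-view patch attends most strongly to the tele-view patch that points in the same feature direction, so the transferred vector is dominated by that patch. A full RefSR pipeline would then fuse the transferred features with the upsampled wide-view features before reconstruction.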

Author information

Authors and Affiliations

State Key Laboratory of Modern Optical Instrumentation, Zhejiang University, 310000, Hangzhou, China
Yifan Yang, Qi Li, Zhihai Xu, Huajun Feng & Yueting Chen

Corresponding author

Correspondence to Qi Li.

Ethics declarations

Conflict of interest

The authors have declared that no conflict of interests or competing interests exist.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, Y., Li, Q., Xu, Z. et al. Continuous digital zoom with cross attention for dual camera system. Multimed Tools Appl 81, 2959–2977 (2022). https://doi.org/10.1007/s11042-021-11688-0

Received: 05 March 2021
Revised: 29 September 2021
Accepted: 18 October 2021
Published: 10 November 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s11042-021-11688-0
