Abstract
Video super-resolution is one of the most popular tasks on mobile devices, where it is widely used to automatically improve low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding and exhibit low frame rates and poor power efficiency on mobile devices. In this Mobile AI challenge, we address this problem and invite the participants to design an end-to-end real-time video super-resolution solution for mobile NPUs optimized for low energy consumption. The participants were provided with the REDS training dataset containing video sequences for a 4X video upscaling task. The runtime and power efficiency of all models were evaluated on the powerful MediaTek Dimensity 9000 platform with a dedicated AI processing unit capable of accelerating floating-point and quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating frame rates of up to 500 FPS and a power consumption of 0.2 [Watt / 30 FPS]. A detailed description of all models developed in the challenge is provided in this paper.
Andrey Ignatov, Radu Timofte, Cheng-Ming Chiang, Hsien-Kai Kuo, Hong-Han Shuai and Wen-Huang Cheng are the main Mobile AI & AIM 2022 challenge organizers. The other authors participated in the challenge.
Appendix A contains the authors’ team names and affiliations.
Mobile AI 2022 Workshop website:
References
Abdelhamed, A., Afifi, M., Timofte, R., Brown, M.S.: NTIRE 2020 challenge on real image denoising: dataset, methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 496–497 (2020)
Abdelhamed, A., Timofte, R., Brown, M.S.: NTIRE 2019 challenge on real image denoising: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Cai, J., Gu, S., Timofte, R., Zhang, L.: NTIRE 2019 challenge on real image super-resolution: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Cai, Y., Yao, Z., Dong, Z., Gholami, A., Mahoney, M.W., Keutzer, K.: ZeroQ: a novel zero shot quantization framework. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13169–13178 (2020)
Chiang, C.M., et al.: Deploying image deblurring across mobile devices: a perspective of quality and latency. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 502–503 (2020)
Conde, M.V., Timofte, R., et al.: Reversed image signal processing and RAW reconstruction. AIM 2022 challenge report. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13803, pp. 3–26. Springer, Cham (2023)
Du, Z., Liu, J., Tang, J., Wu, G.: Anchor-based plain net for mobile image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Fuoli, D., Gu, S., Timofte, R.: Efficient video super-resolution through recurrent latent space propagation. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3476–3485. IEEE (2019)
Gao, S., et al.: RCBSR: re-parameterization convolution block for super-resolution. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13802, pp. 540–548. Springer, Cham (2023)
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
Howard, A., et al.: Searching for MobileNetV3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314–1324 (2019)
Hui, Z., Gao, X., Yang, Y., Wang, X.: Lightweight image super-resolution with information multi-distillation network. In: Proceedings of the 27th ACM International Conference on Multimedia, pp. 2024–2032 (2019)
Ignatov, A., Byeoung-su, K., Timofte, R.: Fast camera image denoising on mobile GPUs with deep learning, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Chiang, J., Kuo, H.K., Sycheva, A., Timofte, R.: Learned smartphone ISP on mobile NPUs with deep learning, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Kobyshev, N., Timofte, R., Vanhoey, K., Van Gool, L.: DSLR-quality photos on mobile devices with deep convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3277–3285 (2017)
Ignatov, A., Kobyshev, N., Timofte, R., Vanhoey, K., Van Gool, L.: WESPE: weakly supervised photo enhancer for digital cameras. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 691–700 (2018)
Ignatov, A., Malivenko, G., Plowman, D., Shukla, S., Timofte, R.: Fast and accurate single-image depth estimation on mobile devices, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Malivenko, G., Timofte, R.: Fast and accurate quantized camera scene detection on smartphones, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., et al.: PyNet-V2 mobile: efficient on-device photo processing with neural networks. In: 2021 26th International Conference on Pattern Recognition (ICPR). IEEE (2022)
Ignatov, A., Malivenko, G., Timofte, R., et al.: Efficient single-image depth estimation on mobile devices, mobile AI & AIM 2022 challenge: report. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13803, pp. 71–91. Springer, Cham (2023)
Ignatov, A., Patel, J., Timofte, R.: Rendering natural camera bokeh effect with deep learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 418–419 (2020)
Ignatov, A., et al.: AIM 2019 challenge on bokeh effect synthesis: methods and results. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3591–3598. IEEE (2019)
Ignatov, A., et al.: MicroISP: processing 32MP photos on mobile devices with deep learning. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13802, pp. 729–746. Springer, Cham (2023)
Ignatov, A., Timofte, R.: NTIRE 2019 challenge on image enhancement: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Ignatov, A., et al.: AI benchmark: running deep neural networks on Android smartphones. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 288–314. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_19
Ignatov, A., Timofte, R., Denna, M., Younes, A.: Real-time quantized image super-resolution on mobile NPUs, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Timofte, R., Denna, M., Younes, A., et al.: Efficient and accurate quantized image super-resolution on mobile NPUs, mobile AI & AIM 2022 challenge: report. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13803, pp. 92–129. Springer, Cham (2023)
Ignatov, A., et al.: AIM 2019 challenge on raw to RGB mapping: methods and results. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3584–3590. IEEE (2019)
Ignatov, A., et al.: AI benchmark: all about deep learning on smartphones in 2019. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3617–3635. IEEE (2019)
Ignatov, A., et al.: AIM 2020 challenge on rendering realistic bokeh. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 213–228. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_13
Ignatov, A., et al.: PIRM challenge on perceptual image enhancement on smartphones: report. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 315–333. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_20
Ignatov, A., et al.: AIM 2020 challenge on learned image signal processing pipeline. arXiv preprint arXiv:2011.04994 (2020)
Ignatov, A., Timofte, R., et al.: Learned smartphone ISP on mobile GPUs with deep learning, mobile AI & AIM 2022 challenge: report. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13803, pp. 44–70. Springer, Cham (2023)
Ignatov, A., Timofte, R., et al.: Realistic bokeh effect rendering on mobile GPUs, mobile AI & AIM 2022 challenge: report. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13803, pp. 153–173. Springer, Cham (2023)
Ignatov, A., Van Gool, L., Timofte, R.: Replacing mobile camera ISP with a single deep learning model. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 536–537 (2020)
Ignatov, D., Ignatov, A.: Controlling information capacity of binary neural network. Pattern Recogn. Lett. 138, 276–281 (2020)
Isobe, T., Zhu, F., Jia, X., Wang, S.: Revisiting temporal modeling for video super-resolution. arXiv preprint arXiv:2008.05765 (2020)
Jacob, B., et al.: Quantization and training of neural networks for efficient integer-arithmetic-only inference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2704–2713 (2018)
Jain, S.R., Gural, A., Wu, M., Dick, C.H.: Trained quantization thresholds for accurate and efficient fixed-point inference of deep neural networks. arXiv preprint arXiv:1903.08066 (2019)
Kappeler, A., Yoo, S., Dai, Q., Katsaggelos, A.K.: Video super-resolution with convolutional neural networks. IEEE Trans. Comput. Imaging 2(2), 109–122 (2016)
Kim, J., Lee, J.K., Lee, K.M.: Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1646–1654 (2016)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kınlı, F.O., Menteş, S., Özcan, B., Kirac, F., Timofte, R., et al.: AIM 2022 challenge on Instagram filter removal: methods and results. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13803, pp. 27–43. Springer, Cham (2023)
Lee, Y.L., Tsung, P.K., Wu, M.: Technology trend of edge AI. In: 2018 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), pp. 1–2. IEEE (2018)
Li, Y., Gu, S., Gool, L.V., Timofte, R.: Learning filter basis for convolutional neural network compression. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5623–5632 (2019)
Li, Y., et al.: NTIRE 2022 challenge on efficient super-resolution: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1062–1102 (2022)
Lian, W., Lian, W.: Sliding window recurrent network for efficient video super-resolution. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13802, pp. 591–601. Springer, Cham (2023)
Lian, W., Peng, S.: Kernel-aware raw burst blind super-resolution. arXiv preprint arXiv:2112.07315 (2021)
Liang, J., et al.: VRT: a video restoration transformer. arXiv preprint arXiv:2201.12288 (2022)
Liu, H., et al.: Video super-resolution based on deep learning: a comprehensive survey. Artif. Intell. Rev. 55, 5981–6035 (2022). https://doi.org/10.1007/s10462-022-10147-y
Liu, J., Tang, J., Wu, G.: Residual feature distillation network for lightweight image super-resolution. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 41–55. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_2
Liu, Z., et al.: MetaPruning: meta learning for automatic neural network channel pruning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3296–3305 (2019)
Liu, Z., Wu, B., Luo, W., Yang, X., Liu, W., Cheng, K.T.: Bi-real net: enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In: Proceedings of the European conference on computer vision (ECCV), pp. 722–737 (2018)
Lugmayr, A., Danelljan, M., Timofte, R.: NTIRE 2020 challenge on real-world image super-resolution: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 494–495 (2020)
Luo, Z., et al.: BSRT: improving burst super-resolution with swin transformer and flow-guided deformable alignment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 998–1008 (2022)
Luo, Z., et al.: EBSR: feature enhanced burst super-resolution with deformable alignment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 471–478 (2021)
Nah, S., et al.: NTIRE 2019 challenge on video deblurring and super-resolution: dataset and study. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Nah, S., Son, S., Timofte, R., Lee, K.M.: NTIRE 2020 challenge on image and video deblurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 416–417 (2020)
Nah, S., et al.: NTIRE 2019 challenge on video super-resolution: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Obukhov, A., Rakhuba, M., Georgoulis, S., Kanakis, M., Dai, D., Van Gool, L.: T-basis: a compact representation for neural networks. In: International Conference on Machine Learning, pp. 7392–7404. PMLR (2020)
Romero, A., Ignatov, A., Kim, H., Timofte, R.: Real-time video super-resolution on smartphones with deep learning, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Sajjadi, M.S., Vemulapalli, R., Brown, M.: Frame-recurrent video super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6626–6634 (2018)
Shi, W., et al.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1874–1883 (2016)
Tan, M., et al.: MnasNet: platform-aware neural architecture search for mobile. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2820–2828 (2019)
TensorFlow-Lite. https://www.tensorflow.org/lite
Timofte, R., Gu, S., Wu, J., Van Gool, L.: NTIRE 2018 challenge on single image super-resolution: methods and results. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 852–863 (2018)
Uhlich, S., et al.: Mixed precision DNNs: all you need is a good parametrization. arXiv preprint arXiv:1905.11452 (2019)
Wan, A., et al.: FBNetV2: differentiable neural architecture search for spatial and channel dimensions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12965–12974 (2020)
Wang, X., Chan, K.C., Yu, K., Dong, C., Change Loy, C.: EDVR: video restoration with enhanced deformable convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Wu, B., et al.: FBNet: hardware-aware efficient convnet design via differentiable neural architecture search. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10734–10742 (2019)
Yang, J., et al.: Quantization networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7308–7316 (2019)
Yang, R., Timofte, R., et al.: AIM 2022 challenge on super-resolution of compressed image and video: dataset, methods and results. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13803, pp. 174–202. Springer, Cham (2023)
Yue, S., Li, C., Zhuge, Z., Song, R.: EESRNet: a network for energy efficient super-resolution. In: Karlinsky, L., et al. (eds.) ECCV 2022. LNCS, vol. 13802, pp. xx–yy. Springer, Cham (2023)
Zhang, X., Zeng, H., Zhang, L.: Edge-oriented convolution block for real-time super resolution on mobile devices. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 4034–4043 (2021)
Acknowledgements
We thank the sponsors of the Mobile AI and AIM 2022 workshops and challenges: AI Witchlabs, MediaTek, Huawei, Reality Labs, OPPO, Synaptics, Raspberry Pi, ETH Zürich (Computer Vision Lab) and University of Würzburg (Computer Vision Lab).
A Teams and Affiliations
1.1 Mobile AI & AIM 2022 Team
Title:
Mobile AI & AIM 2022 Video Super-Resolution Challenge
Members:
Andrey Ignatov\(^{1,2}\) (andrey@vision.ee.ethz.ch), Radu Timofte\(^{1,2,3}\) (radu.timofte@uni-wuerzburg.de), Cheng-Ming Chiang\(^{4}\) (jimmy.chiang@mediatek.com), Hsien-Kai Kuo\(^{4}\) (hsienkai.kuo@mediatek.com), Yu-Syuan Xu\(^{4}\) (yu-syuan.xu@mediatek.com), Man-Yu Lee\(^{4}\) (my.lee@mediatek.com), Allen Lu\(^{4}\) (allen-cl.lu@mediatek.com), Chia-Ming Cheng\(^{4}\) (cm.cheng@mediatek.com), Chih-Cheng Chen\(^{4}\) (ryan.chen@mediatek.com), Jia-Ying Yong\(^{4}\) (jiaying.ee10@nycu.edu.tw), Hong-Han Shuai\(^{5}\) (hhshuai@nycu.edu.tw), Wen-Huang Cheng\(^{5}\) (whcheng@nycu.edu.tw)
Affiliations:
\(^1\) Computer Vision Lab, ETH Zurich, Switzerland
\(^2\) AI Witchlabs, Switzerland
\(^3\) University of Wuerzburg, Germany
\(^4\) MediaTek Inc., Taiwan
\(^5\) National Yang Ming Chiao Tung University, Taiwan
1.2 MVideoSR
Title:
Extreme Low Power Network for Real-time Video Super Resolution
Members:
Zhuang Jia (jiazhuang@xiaomi.com), Tianyu Xu (xutianyu@xiaomi.com), Yijian Zhang (zhangyijian@xiaomi.com), Long Bao (baolong@xiaomi.com), Heng Sun (sunheng3@xiaomi.com)
Affiliations:
Video Algorithm Group, Camera Department, Xiaomi Inc., China
1.3 ZX VIP
Title:
Real-Time Video Super-Resolution Model [9]
Members:
Diankai Zhang (zhang.diankai@zte.com.cn), Si Gao, Shaoli Liu, Biao Wu, Xiaofeng Zhang, Chengjian Zheng, Kaidi Lu, Ning Wang
Affiliations:
Audio & Video Technology Platform Department, ZTE Corp., China
1.4 Fighter
Title:
Fast Real-Time Video Super-Resolution
Members:
Xiao Sun (2609723059@qq.com), HaoDong Wu
Affiliations:
None, China
1.5 XJTU-MIGU SUPER
Title:
Light and Fast On-Mobile VSR
Members:
Xuncheng Liu (liuxuncheng123@stu.xjtu.edu.cn), Weizhan Zhang, Caixia Yan, Haipeng Du, Qinghua Zheng, Qi Wang, Wangdu Chen
Affiliations:
School of Computer Science and Technology, Xi’an Jiaotong University, China
MIGU Video Co. Ltd, China
1.6 BOE-IOT-AIBD
Title:
Lightweight Quantization CNN-Net for Mobile Video Super-Resolution
Members:
Ran Duan (duanr@boe.com.cn), Mengdi Sun, Dan Zhu, Guannan Chen
Affiliations:
BOE Technology Group Co., Ltd., China
1.7 GenMedia Group
Title:
SkipSkip Video Super-Resolution
Members:
Hojin Cho (jin@gengen.ai), Steve Kim
Affiliations:
GenGenAI, South Korea
1.8 NCUT VGroup
Title:
EESRNet: A Network for Energy Efficient Super Resolution [73]
Members:
Shijie Yue (1161126955@qq.com), Chenghua Li, Zhengyang Zhuge
Affiliations:
North China University of Technology, China
Institute of Automation, Chinese Academy of Sciences, China
1.9 Mortar ICT
Title:
Real-Time Video Super-Resolution Model
Members:
Wei Chen (chenwei21s@ict.ac.cn), Wenxu Wang, Yufeng Zhou
Affiliations:
State Key Laboratory of Computer Architecture, Institute of Computing Technology, China
1.10 RedCat AutoX
Title:
Forward Recurrent Residual Network
Members:
Xiaochen Cai\(^{1}\) (caixc@lamda.nju.edu.cn), Hengxing Cai\(^{1}\), Kele Xu\(^{2}\), Li Liu\(^{2}\), Zehua Cheng\(^{3}\)
Affiliations:
\(^{1}\)4Paradigm Inc., Beijing, China
\(^{2}\)National University of Defense Technology, Changsha, China
\(^{3}\)University of Oxford, Oxford, United Kingdom
1.11 221B
Title:
Sliding Window Recurrent Network for Efficient Video Super-Resolution [47]
Members:
Wenyi Lian (shermanlian@163.com), Wenjing Lian
Affiliations:
Uppsala University, Sweden
Northeastern University, China
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ignatov, A. et al. (2023). Power Efficient Video Super-Resolution on Mobile NPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13803. Springer, Cham. https://doi.org/10.1007/978-3-031-25066-8_6
DOI: https://doi.org/10.1007/978-3-031-25066-8_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25065-1
Online ISBN: 978-3-031-25066-8
eBook Packages: Computer Science (R0)