SplitSR: An End-to-End Approach to Super-Resolution on Mobile Devices

Published: 30 March 2021

Abstract

Super-resolution (SR) is a coveted image processing technique for mobile apps ranging from basic camera apps to mobile health. Existing SR algorithms rely on deep learning models with significant memory requirements, so they have yet to be deployed on mobile devices and instead run in the cloud to achieve feasible inference times. This shortcoming prevents existing SR methods from being used in applications that require near real-time latency. In this work, we demonstrate state-of-the-art latency and accuracy for on-device super-resolution using a novel hybrid architecture called SplitSR and a novel lightweight residual block called SplitSRBlock. The SplitSRBlock supports channel-splitting, allowing the residual blocks to retain spatial information while reducing the computation in the channel dimension. SplitSR has a hybrid design consisting of standard convolutional blocks and lightweight residual blocks, allowing users to tune SplitSR for their computational budget. We evaluate our system on a low-end ARM CPU, demonstrating both higher accuracy and up to 5× faster inference than previous approaches. We then deploy our model onto a smartphone in an app called ZoomSR to demonstrate the first-ever instance of on-device, deep learning-based SR. We conducted a user study with 15 participants, who assessed the perceived quality of images post-processed by SplitSR. Relative to bilinear interpolation, the existing standard for on-device SR, participants showed a statistically significant preference for the SplitSR output when looking at both images (Z=-9.270, p<0.01) and text (Z=-6.486, p<0.01).
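
To illustrate why channel-splitting reduces computation in the channel dimension, the following is a minimal NumPy sketch, not the SplitSRBlock itself. The abstract does not give the block's layout, so the split ratio `alpha`, the function names, the ordering of the concatenated channels, and the stand-in `transform` (used here in place of a real convolution) are all assumptions made for illustration.

```python
import numpy as np


def conv3x3_macs(height, width, c_in, c_out):
    """Multiply-accumulates for one 3x3 convolution with 'same' padding."""
    return height * width * 3 * 3 * c_in * c_out


def channel_split_block(x, alpha, transform):
    """Apply `transform` to a fraction `alpha` of the channels of an
    (H, W, C) feature map and pass the remaining channels through untouched.

    Hypothetical layout: the bypassed channels are placed first in the
    output, so a following block with the same `alpha` would process
    channels that this block skipped.
    """
    c_active = int(round(x.shape[-1] * alpha))
    active, bypass = x[..., :c_active], x[..., c_active:]
    processed = transform(active)  # must preserve the shape of `active`
    return np.concatenate([bypass, processed], axis=-1)


if __name__ == "__main__":
    # Processing a quarter of 64 channels costs 1/16 of the full 3x3 conv,
    # because both the input and output channel counts shrink by alpha.
    full = conv3x3_macs(32, 32, 64, 64)
    split = conv3x3_macs(32, 32, 16, 16)
    print(full // split)  # 16

    x = np.ones((32, 32, 64))
    y = channel_split_block(x, alpha=0.25, transform=lambda a: a * 2.0)
    print(y.shape)  # (32, 32, 64)
```

The spatial size of the feature map is unchanged, which matches the abstract's claim that spatial information is retained; the saving comes only from the smaller channel counts and scales with the square of `alpha` in this sketch.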

Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 5, Issue 1
March 2021
1272 pages
EISSN: 2474-9567
DOI: 10.1145/3459088
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. edge computing
  2. image super-resolution
  3. mobile computing
  4. on-device machine learning

Qualifiers

  • Research-article
  • Research
  • Refereed
