PatchSR: Achieving HD Video Streaming with Pre-downloaded Universal SR Models

Authors: Gangqiang Zhou, Miao Hu

ICCAI '23: Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence

Pages 343 - 350

https://doi.org/10.1145/3594315.3594664

Published: 02 August 2023

Abstract

High-definition (HD) video streaming has gained tremendous popularity with the proliferation of smartphones and mobile networks. However, current systems struggle to deliver HD online video streams directly to devices with very low bandwidth. In this paper, we propose PatchSR, a neural-enhanced HD video streaming system that provides HD video streaming for bandwidth-constrained devices. PatchSR delivers high-performance universal super-resolution (SR) models to devices in advance. Only low-resolution video streams are then sent to bandwidth-constrained devices, and video quality is enhanced on the device side with SR techniques. The main challenge is training multiple high-performance universal SR models and selecting the dedicated SR model for each piece of video content. To overcome this challenge, we propose an algorithm that classifies images by texture features, based on the Discrete Fourier Transform (DFT) feature map of each training patch. We also design a dynamic SR model selection algorithm for clients to improve video quality. Finally, we implement PatchSR and evaluate it with real network traces. The experimental results show that PatchSR achieves higher video quality and up to 28.65% QoE improvement compared to baselines.
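
The abstract does not specify how the DFT feature map is turned into a texture class, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a grayscale patch, scores texture as the share of spectral energy outside a low-frequency disc, and maps that score to a class with fixed thresholds. The function names, the radius fraction, the thresholds, and the model names are all hypothetical.

```python
import numpy as np

# Hypothetical names for pre-downloaded SR models, one per texture class.
MODEL_FOR_CLASS = {0: "sr_smooth", 1: "sr_medium", 2: "sr_detailed"}


def high_freq_ratio(patch: np.ndarray, radius_frac: float = 0.25) -> float:
    """Share of spectral energy outside a centred low-frequency disc.

    Assumption: this ratio stands in for the texture feature; the paper's
    actual feature is not described in the abstract.
    """
    data = patch.astype(np.float64)
    data = data - data.mean()  # drop the DC term so brightness is ignored
    spectrum = np.fft.fftshift(np.fft.fft2(data))  # zero frequency at centre
    energy = np.abs(spectrum) ** 2

    h, w = data.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    low_mask = dist <= radius_frac * min(h, w)

    total = energy.sum()
    if total < 1e-12:  # flat patch: no texture at all
        return 0.0
    return float(energy[~low_mask].sum() / total)


def texture_class(patch: np.ndarray, thresholds=(0.01, 0.1)) -> int:
    """Bucket the score into classes 0..len(thresholds) (assumed thresholds)."""
    return int(np.searchsorted(thresholds, high_freq_ratio(patch)))


def select_model(patch: np.ndarray) -> str:
    """Pick the (hypothetical) SR model name for a patch's texture class."""
    return MODEL_FOR_CLASS[texture_class(patch)]


if __name__ == "__main__":
    flat = np.full((32, 32), 7.0)
    checker = np.indices((32, 32)).sum(axis=0) % 2
    print(select_model(flat), select_model(checker))
```

A fixed threshold is the simplest possible classifier here; clustering the feature over a training set would be a natural alternative, at the cost of an extra training step.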



Published In

ICCAI '23: Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence

March 2023

824 pages

ISBN:9781450399029

DOI:10.1145/3594315

Copyright © 2023 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China

Conference

ICCAI 2023

ICCAI 2023: 2023 9th International Conference on Computing and Artificial Intelligence

March 17 - 20, 2023

Tianjin, China
