Abstract
360° video has become one of the major media formats in recent years, offering viewers an immersive experience with richer interaction than traditional video. Most of today's implementations rely on bulky head-mounted displays (HMDs) or require touch-screen operation for interactive display, which is not only expensive but also inconvenient for viewers. In this paper, we demonstrate that interactive 360° video streaming can be driven by hints from gaze movement detected by the front camera of today's mobile devices (e.g., a smartphone). We design a lightweight real-time gaze-point tracking method for this purpose, integrate it with the streaming module, and apply a dynamic margin adaptation algorithm to minimize the overall energy consumption of battery-constrained mobile devices. Our experiments on state-of-the-art smartphones show the feasibility of our solution and its energy efficiency toward cost-effective real-time 360° video streaming.
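The dynamic margin adaptation mentioned above can be illustrated with a minimal sketch: the streamed viewport carries an extra margin that widens when the gaze moves quickly (so fast viewpoint changes do not outrun the fetched region) and shrinks when the gaze is stable (so less pixel area is fetched and decoded, saving energy). The class name, thresholds, and linear speed-to-margin mapping below are illustrative assumptions, not the paper's actual algorithm.

```python
from collections import deque

class MarginAdapter:
    """Hypothetical sketch of dynamic margin adaptation for
    gaze-driven 360° streaming: widen the viewport margin under
    fast gaze movement, shrink it when the gaze is stable.
    All constants here are illustrative, not from the paper."""

    def __init__(self, min_margin=0.05, max_margin=0.30, window=10):
        self.min_margin = min_margin            # margin as a fraction of viewport width
        self.max_margin = max_margin
        self.samples = deque(maxlen=window)     # recent gaze points (x, y), normalized

    def update(self, gaze_x, gaze_y):
        """Record one gaze sample and return the margin to request."""
        self.samples.append((gaze_x, gaze_y))
        if len(self.samples) < 2:
            return self.max_margin              # be conservative until history exists
        pts = list(self.samples)
        # Mean per-frame gaze displacement over the sliding window.
        speed = sum(
            ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
            for (x1, y1), (x2, y2) in zip(pts, pts[1:])
        ) / (len(pts) - 1)
        # Map speed linearly onto [min_margin, max_margin]; 0.1 is an
        # assumed "fast" displacement per frame, clamped at both ends.
        margin = self.min_margin + speed * (self.max_margin - self.min_margin) / 0.1
        return max(self.min_margin, min(self.max_margin, margin))
```

With a perfectly still gaze the adapter converges to the minimum margin (lowest energy cost); with a rapidly sweeping gaze it saturates at the maximum margin to keep the viewport covered.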
Shen, L., Chen, Y. & Liu, J. Gaze-Assisted Viewport Control for 360° Video on Smartphone. J. Comput. Sci. Technol. 37, 906–918 (2022). https://doi.org/10.1007/s11390-022-2037-5