Research article, DOI: 10.1145/3651863.3651876

You Only Look Once in Panorama: Object Detection for 360° Videos with MLaaS

Published: 15 April 2024

Abstract

360° videos are gaining popularity, but immersive analytics on them, particularly object detection, is challenged by complex scenes and high data volume. These demands place a significant burden on individual users and resource-limited edge devices. Machine Learning as a Service (MLaaS) offers an economical way to deploy quickly without dedicated hardware or expertise. However, current MLaaS offerings are mostly designed for 2D images and are not optimized for the distinctive characteristics of raw 360° video frames. In this paper, we propose a novel MLaaS-based system to address this challenge. Our solution partitions 360° frames into distortion-free 2D regions using dynamic region-of-interest prediction. We then present an image-stitching algorithm based on a Skyline representation that combines all the 2D regions into a single unified frame. This frame is transmitted to the MLaaS platform, and the detected objects are back-projected to yield the final results. Our experiments show that the system outperforms the baselines, demonstrating its effectiveness in 360° video object detection tasks.
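
The abstract does not spell out the stitching or back-projection steps, so the following is only a minimal sketch of the general idea, assuming a basic skyline bottom-left heuristic for rectangle packing rather than the paper's actual algorithm. The names `Placement`, `pack`, and `to_region` are hypothetical, region sizes are taken as given, and the final mapping from a region back onto the sphere is omitted. Only the step from the stitched canvas back to a region's local coordinates is shown.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    x: int  # left edge on the stitched canvas
    y: int  # bottom edge on the stitched canvas
    w: int
    h: int


def _rest_height(skyline, i, w, canvas_w):
    """Height at which a w-wide rectangle rests if its left edge sits at segment i."""
    x = skyline[i][0]
    if x + w > canvas_w:
        return None  # would stick out of the canvas
    y, remaining, j = 0, w, i
    while remaining > 0:
        y = max(y, skyline[j][1])  # highest skyline segment under the rectangle
        remaining -= skyline[j][2]
        j += 1
    return y


def pack(sizes, canvas_w):
    """Place (w, h) rectangles in order, each at the spot with the lowest top edge."""
    skyline = [(0, 0, canvas_w)]  # segments (x, height, width) covering the canvas
    placements = []
    for w, h in sizes:
        best = None  # (top edge, x, y)
        for i in range(len(skyline)):
            y = _rest_height(skyline, i, w, canvas_w)
            if y is not None and (best is None or (y + h, skyline[i][0]) < best[:2]):
                best = (y + h, skyline[i][0], y)
        if best is None:
            raise ValueError("region is wider than the canvas")
        top, x, y = best
        placements.append(Placement(x, y, w, h))
        # Rebuild the skyline: keep uncovered parts, add the new rectangle's top.
        # Adjacent segments of equal height are not merged in this sketch.
        updated = [(x, top, w)]
        for sx, sy, sw in skyline:
            if sx < x:
                updated.append((sx, sy, min(sw, x - sx)))
            if sx + sw > x + w:
                left = max(sx, x + w)
                updated.append((left, sy, sx + sw - left))
        skyline = sorted(updated)
    return placements


def to_region(box, placements):
    """Map a detection box (x1, y1, x2, y2) on the canvas to (region index, local box)."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    for idx, p in enumerate(placements):
        if p.x <= cx < p.x + p.w and p.y <= cy < p.y + p.h:
            local = (box[0] - p.x, box[1] - p.y, box[2] - p.x, box[3] - p.y)
            return idx, local
    return None  # box centre falls in unused canvas space
```

In this sketch a detection is assigned to the region that contains its centre. A fuller system would also have to handle boxes that straddle two packed regions and then project each local box back to spherical coordinates.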


      Published In

      NOSSDAV '24: Proceedings of the 34th edition of the Workshop on Network and Operating System Support for Digital Audio and Video
      April 2024
      77 pages
      ISBN:9798400706134
      DOI:10.1145/3651863
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. 360° video
      2. object detection
      3. machine learning as a service

      Qualifiers

      • Research-article

      Conference

      NOSSDAV '24

      Acceptance Rates

      Overall Acceptance Rate 118 of 363 submissions, 33%

