ABSTRACT
Emerging Internet of Things (IoT) and mobile computing applications are expected to support latency-sensitive deep neural network (DNN) workloads. To realize this vision, the Internet is evolving towards an edge-computing architecture, where computing infrastructure is located closer to the end device to help achieve low latency. However, edge devices often have limited resources compared to cloud environments and thus cannot run the large DNN models that typically achieve high accuracy. In this work, we develop REACT, a framework that uses cloud resources to run large, more accurate DNN models in order to improve the accuracy of the models running on edge devices. To do so, we propose a novel edge-cloud fusion algorithm that fuses edge and cloud predictions, achieving both low latency and high accuracy. We extensively evaluate our approach and show that it significantly improves accuracy over baseline approaches. We focus specifically on object detection in videos (applicable in many video analytics scenarios) and show that the fused edge-cloud predictions can outperform the accuracy of edge-only and cloud-only scenarios by as much as 50%. REACT shows that for Edge AI, the choice between offloading and on-device inference is not binary: redundant executions at cloud and edge locations complement each other when carefully employed.
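To make the idea of fusing edge and cloud predictions concrete, the following is a minimal illustrative sketch and not REACT's actual algorithm, which the abstract does not specify. It assumes both models return axis-aligned boxes with a label and a score, that boxes are matched greedily by intersection-over-union (IoU), and that a matched pair keeps the fresher edge box together with the cloud model's label. The names `Detection`, `iou`, `fuse`, and the `iou_threshold` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


@dataclass
class Detection:
    box: Box
    label: str
    score: float


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse(edge: List[Detection], cloud: List[Detection],
         iou_threshold: float = 0.5) -> List[Detection]:
    """Greedy fusion (an assumption, not the paper's method).

    Each edge detection is paired with the best-overlapping unused cloud
    detection. A matched pair keeps the edge box (computed on the newer
    frame) and the cloud label (from the larger model). Unmatched
    detections from either side are passed through unchanged.
    """
    fused: List[Detection] = []
    unused = list(range(len(cloud)))
    for e in edge:
        best_j, best_overlap = None, iou_threshold
        for j in unused:
            overlap = iou(e.box, cloud[j].box)
            if overlap >= best_overlap:
                best_j, best_overlap = j, overlap
        if best_j is None:
            fused.append(e)
        else:
            c = cloud[best_j]
            unused.remove(best_j)
            fused.append(Detection(e.box, c.label, max(e.score, c.score)))
    fused.extend(cloud[j] for j in unused)
    return fused


edge_dets = [Detection((0, 0, 10, 10), "car", 0.6),
             Detection((50, 50, 60, 60), "person", 0.4)]
cloud_dets = [Detection((1, 1, 11, 11), "truck", 0.9),
              Detection((100, 100, 120, 120), "bus", 0.8)]
result = fuse(edge_dets, cloud_dets)
```

In this toy input, the first edge box overlaps a cloud box and takes its label, while the remaining edge and cloud detections have no counterpart and are kept as they are. A real system would also need to account for the cloud result arriving several frames late, which this sketch ignores.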