DOI: 10.1145/3472883.3486972

Llama: A Heterogeneous & Serverless Framework for Auto-Tuning Video Analytics Pipelines

Published: 01 November 2021

ABSTRACT

The proliferation of camera-enabled devices and large video repositories has led to a diverse set of video analytics applications. These applications rely on video pipelines, represented as DAGs of operations, to transform videos, process extracted metadata, and answer questions like, "Is this intersection congested?" The latency and resource efficiency of pipelines can be optimized using configurable knobs for each operation (e.g., sampling rate, batch size, or type of hardware used). However, determining efficient configurations is challenging because (a) the configuration search space is exponentially large, (b) the optimal configuration depends on users' desired latency and cost targets, and (c) input video contents may exercise different paths in the DAG and produce a variable amount of intermediate results. Existing video analytics and processing systems leave it to users to manually configure operations and select hardware resources.
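To make the "exponentially large" claim concrete, the following minimal sketch counts configurations for a pipeline in which every operation exposes the same three knobs. The knob names follow the examples in the abstract, but the value lists and the assumption that all operations share one knob set are purely illustrative and are not taken from the paper.

```python
from math import prod

# Hypothetical knob values for one operation (illustrative only, not from the paper).
KNOBS = {
    "sampling_rate": [1, 5, 10, 30],
    "batch_size": [1, 4, 16, 64],
    "hardware": ["cpu", "gpu"],
}


def configs_per_operation(knobs=KNOBS):
    """Number of distinct knob combinations for a single operation."""
    return prod(len(values) for values in knobs.values())


def pipeline_search_space(num_operations, knobs=KNOBS):
    """Number of joint configurations when each operation is tuned independently."""
    return configs_per_operation(knobs) ** num_operations


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, pipeline_search_space(n))
```

Under these assumed values, each operation has 32 configurations, so the joint space grows as 32 raised to the number of operations.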

We present Llama: a heterogeneous and serverless framework for auto-tuning video pipelines. Given an end-to-end latency target, Llama optimizes for cost efficiency by (a) calculating a latency target for each operation invocation, and (b) dynamically running a cost-based optimizer to assign configurations across heterogeneous hardware that best meet the calculated per-invocation latency target. This makes the problem of auto-tuning large video pipelines tractable and allows us to handle input-dependent behavior, conditional branches in the DAG, and execution variability. We describe the algorithms in Llama and evaluate it on a cloud platform using serverless CPU and GPU resources. We show that compared to state-of-the-art cluster and serverless video analytics and processing systems, Llama achieves 7.8x lower latency and 16x cost reduction on average.
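The two steps named above, deriving a per-invocation latency target and then choosing the cheapest configuration that meets it, can be sketched as follows. This is not the paper's algorithm: the proportional split of the remaining budget, the fallback to the fastest configuration, and every name (`Config`, `per_invocation_target`, `pick_config`) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """A hypothetical hardware/knob choice with profiled latency and cost."""
    name: str
    latency_s: float
    cost: float


def per_invocation_target(remaining_budget_s, this_op_est_s, downstream_est_s):
    """Assumed policy: split the remaining budget in proportion to estimated latency."""
    total = this_op_est_s + downstream_est_s
    return remaining_budget_s * this_op_est_s / total


def pick_config(configs, target_s):
    """Cheapest configuration that meets the target; else fall back to the fastest."""
    feasible = [c for c in configs if c.latency_s <= target_s]
    if feasible:
        return min(feasible, key=lambda c: c.cost)
    return min(configs, key=lambda c: c.latency_s)


if __name__ == "__main__":
    configs = [Config("cpu-batch1", 4.0, 1.0), Config("gpu-batch8", 1.0, 3.0)]
    target = per_invocation_target(10.0, this_op_est_s=2.0, downstream_est_s=3.0)
    print(target, pick_config(configs, target).name)
```

Because the target is recomputed from the budget that remains at each invocation, a sketch like this can react to earlier operations running faster or slower than estimated, which is the kind of execution variability the abstract refers to.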


Supplemental Material

Day1_Session1_Order_1_Llama.mp4 (mp4, 450.7 MB)


      • Published in

        SoCC '21: Proceedings of the ACM Symposium on Cloud Computing
        November 2021
        685 pages
        ISBN: 9781450386388
        DOI: 10.1145/3472883

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 169 of 722 submissions, 23%
