research-article

Jumpgate: automating integration of network connected accelerators

Published: 14 June 2021

Abstract

Network-connected accelerators (NCAs), such as programmable switches, ASICs, and FPGAs, can speed up operations in data analytics. So far, however, integrating NCAs into data analytics systems has required manual effort.
We present Jumpgate, a system that simplifies the integration of existing NCA code into data analytics systems such as Apache Spark or Presto. Jumpgate places most of the integration code in the analytics system, where it needs to be written only once, leaving NCA programmers to write only a couple hundred lines of code to integrate a new NCA. Jumpgate relies on the dataflow graphs that most analytics systems use internally, and it takes care of invoking NCAs, performing the necessary format conversion, and orchestrating their execution via novel staged network pipelines.
Our implementation of Jumpgate in Apache Spark made it possible, for the first time, to study the benefits and drawbacks of using NCAs across the entire range of queries in the TPC-DS benchmark. Because we lack hardware that can accelerate all analytics operations, we implemented the NCAs in software. We report on how and when analytics workloads will benefit from NCAs to motivate future designs.
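
As a rough illustration of the idea of mapping a dataflow graph onto accelerators, the sketch below assigns each operator of a simplified, linear dataflow to the first accelerator that claims to support it, falls back to the host otherwise, and groups consecutive operators with the same target into stages. It is not Jumpgate's actual interface: the names `Accelerator` and `plan_stages`, the operator labels, and the restriction to a linear chain rather than a general graph are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical description of an NCA: a name plus the operators it can run."""
    name: str
    supported: frozenset


def plan_stages(operators, accelerators):
    """Group a linear chain of operators into (target, [operators]) stages.

    Each operator goes to the first accelerator that supports it, or to
    "host" if none does. Consecutive operators with the same target are
    merged into one stage, so every stage boundary marks a point where
    data would have to move (and possibly be converted) between targets.
    """
    stages = []
    for op in operators:
        target = next(
            (acc.name for acc in accelerators if op in acc.supported),
            "host",
        )
        if stages and stages[-1][0] == target:
            stages[-1][1].append(op)  # extend the current stage
        else:
            stages.append((target, [op]))  # start a new stage
    return stages


if __name__ == "__main__":
    # Illustrative operator names and accelerators, not taken from the paper.
    query = ["scan", "filter", "project", "join", "aggregate"]
    ncas = [
        Accelerator("switch", frozenset({"filter", "project"})),
        Accelerator("fpga", frozenset({"aggregate"})),
    ]
    for target, ops in plan_stages(query, ncas):
        print(target, ops)
```

In this toy model the number of stage boundaries gives a crude sense of how often data changes hands between the host and an accelerator, which is one reason offloading an isolated operator may not pay off.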

Cited By

  • (2023) Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs. Proceedings of the ACM on Measurement and Analysis of Computing Systems 7:2 (1-23). DOI: 10.1145/3589980. Online publication date: 22-May-2023

Published In

SYSTOR '21: Proceedings of the 14th ACM International Conference on Systems and Storage
June 2021
226 pages
ISBN:9781450383981
DOI:10.1145/3456727
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • Technion: Israel Institute of Technology
  • USENIX Association

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SYSTOR '21
Acceptance Rates

SYSTOR '21 Paper Acceptance Rate: 18 of 63 submissions, 29%
Overall Acceptance Rate: 108 of 323 submissions, 33%
