Poster: A Fast, Scalable, and Energy-efficient Edge Acceleration Architecture based on GPU Cluster

Authors:
Yanwei Wang, Inspur Electronic Information Industry Co., Ltd, Jinan, China (ORCID: 0000-0001-6380-3626)
Dongdong Su, Inspur Electronic Information Industry Co., Ltd & Guangdong Inspur Intelligent Computing Technology Co., Ltd, Jinan, China (ORCID: 0009-0005-9021-5308)
Qianqian Zhao, Inspur Electronic Information Industry Co., Ltd, Jinan, China (ORCID: 0000-0002-6023-9879)
Linge Xiao, Inspur Electronic Information Industry Co., Ltd, Jinan, China (ORCID: 0000-0002-2491-4574)
Yanmei Shen, Inspur Electronic Information Industry Co., Ltd, Jinan, China (ORCID: 0009-0000-4408-9498)
Kefeng Zhu, Guangdong Inspur Intelligent Computing Technology Co., Ltd, Guangzhou, China (ORCID: 0000-0002-0377-6176)

CoNEXT 2023: Companion of the 19th International Conference on emerging Networking EXperiments and Technologies, December 2023, Pages 53–54, https://doi.org/10.1145/3624354.3630083

Published: 05 December 2023

ABSTRACT

GPU-aided servers are widely used to accelerate the growing computing workload on edge networks. However, massive numbers of computing requests can cause transmission and processing delays at edge servers, leading to long response times and high system energy consumption. Moreover, in distributed scenarios, GPUs from different vendors can hardly communicate directly over the network. Vortex, one of the few open-source GPU architectures, enables an FPGA accelerator to serve as a PCIe-based soft GPU. Building on this, the Vortex GPUs in our edge acceleration architecture further integrate a low-latency RDMA function. Our preliminary study, which compares against the original cluster of Vortex GPU-aided servers, shows that our Vortex GPU cluster is fast, scalable, and energy-efficient.

Index Terms

1. Networks
  1. Network architectures

Published in
CoNEXT 2023: Companion of the 19th International Conference on emerging Networking EXperiments and Technologies
December 2023
80 pages
ISBN: 9798400704079
DOI: 10.1145/3624354
General Chairs: Dario Rossi (Huawei Technologies), Stefano Secci (CNAM)
Program Chairs: Olivier Bonaventure (UCLouvain), Lili Qiu (Microsoft Research Asia and University of Texas, Austin)
Copyright © 2023 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
RDMA
computing workload
edge acceleration
vortex GPU cluster
Qualifiers
- poster
Acceptance Rates
Overall Acceptance Rate: 198 of 789 submissions, 25%